# Building Vertical AI Startups and Mastering AI Visuals

**Podcast:** The Startup Ideas Podcast
**Published:** 2026-04-22

## Transcript

It's a big day because ChatGPT Images 2.0 just came out and we're going to take it for a spin.
We're going to actually see which use cases you can use to make money from it and actually help you build your business.
I'm going to go through a bunch of use cases.
By the end of this episode, you're going to be just feeling comfortable and knowing how to work with this thing.
But that's not all.
We're not just going to go through ChatGPT Images 2.0 in this episode.
I'm also going to give you a startup idea that I think someone should please, please steal.
I'm also going to give you an AI tool that I think is really, really interesting that I wish more people were talking about.
And I think you're going to really like it.
And we're going to talk about a framework that people can use to come up with vertical AI agent startup ideas to make money.
It's one of the spaces I think is just ripe for opportunity.
So I'm going to give away a framework for that.
So we've got all that covered in this episode, so it's going to get your creative juices flowing.
But first, let's start with ChatGPT Images 2.0.
So what are the three things that the old one couldn't do?
Like what's actually changed here?
Well, the first thing is it's going to be able to go to 2K resolution.
You can get to 3:1 aspect ratios, and it's spitting out eight images per prompt.
So that's a big deal.
It feels more powerful.
The second thing, and I know a lot of people are going to be happy to hear this, it's rendering dense and tiny multi-language text correctly.
So one of the biggest problems with not just ChatGPT Images as a product, but really a lot of the creative LLMs, is getting text wrong.
Now, Images 2.0 in my testing, I will say, hasn't been right 100% of the time.
I won't say it's perfect, but I will say it's noticeably better than the last version.
Also, the ability to have multi-languages there, Japanese, Korean, Chinese, Hindi, I'm sure a lot of people are going to be happy about that.
The type of creative that you can use 2.0 for is just a lot more vast, so now you can use it for things like UI.
Before, I really wouldn't use it for UI.
I wouldn't even use it for infographics.
It wasn't that good.
Packaging, it's actually pretty good at packaging.
Posters, stuff like that.
And the third thing, and for me this is probably the most interesting, is it has what's called thinking mode.
So basically when you prompt Images 2.0, it's not just going to look at your prompt and then just spit something out.
It's going to look at your prompt, but also search the web, do some fact checking, and generate up to eight related images that stay consistent with one another.
So it's kind of obvious in hindsight that it should actually crawl the web, but happy it's finally in there.
Thank you, Sam Altman.
So let's actually go through some prompts that I did that I think people should steal.
I can include some of them in the description.
Let me know.
So the first is my buddy Sahil Bloom has a skincare line called Wild Roman.
So I basically was like, create a visual style reference for Wild Roman, a modern skincare brand for men 25 to 40.
So I basically give it the aesthetic.
I give it the film camera.
So I said it's a Contax T2 film camera.
I give it, you know, the lighting.
I give it the color palette: warm creams, sun-bleached terracotta.
I give it the mood, Mediterranean lifestyle.
And then of course I give it the subjects, like who's actually going to be in the photos.
Real humans, natural posture, slight imperfection.
Important to include slight imperfections because otherwise you get stock photos.
So that's something I learned over time.
So I asked it to generate eight example images: a product shot, a lifestyle shot, a hand shot, a detailed texture shot, an environmental shot, a lifestyle portrait, a packaging flat lay, and an ingredient story shot.
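A brief like this is easier to reuse if each ingredient is its own field. Here is a minimal sketch of that idea, not the exact prompt used in the episode: the `StyleBrief` class and its field names are hypothetical, and the lighting value is a placeholder because the episode doesn't say what lighting was used.

```python
from dataclasses import dataclass, field


@dataclass
class StyleBrief:
    """Hypothetical container for the parts of a visual style reference prompt."""
    brand: str
    audience: str
    camera: str
    lighting: str
    palette: list[str]
    mood: str
    subjects: str
    shots: list[str] = field(default_factory=list)

    def to_prompt(self) -> str:
        # One line per constraint, then a numbered list of the requested shots.
        lines = [
            f"Create a visual style reference for {self.brand}, {self.audience}.",
            f"Camera: {self.camera}.",
            f"Lighting: {self.lighting}.",
            f"Color palette: {', '.join(self.palette)}.",
            f"Mood: {self.mood}.",
            f"Subjects: {self.subjects}.",
            f"Generate {len(self.shots)} example images:",
        ]
        lines += [f"{i}. {shot}" for i, shot in enumerate(self.shots, start=1)]
        return "\n".join(lines)


brief = StyleBrief(
    brand="Wild Roman",
    audience="a modern skincare brand for men 25 to 40",
    camera="Contax T2 film camera",
    lighting="soft natural light",  # placeholder: not stated in the episode
    palette=["warm creams", "sun-bleached terracotta"],
    mood="Mediterranean lifestyle",
    subjects="real humans, natural posture, slight imperfections",
    shots=[
        "product shot",
        "lifestyle shot",
        "hand shot",
        "detailed texture shot",
        "environmental shot",
        "lifestyle portrait",
        "packaging flat lay",
        "ingredient story shot",
    ],
)

prompt = brief.to_prompt()
print(prompt)
```

Paste the printed text into the image tool as the prompt. To build a brief for another brand, change the fields and keep the structure.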
And Images 2.0 just cooked.
Does that look AI to you?
Does that look AI to you?
These look like real photos.
That product looks like something that people would buy.
And I just think that it's really good at this.
But you do need to have the aesthetic dialed in.
Then you have to be very specific with 2.0.
Really, really specific.
If you're not specific, what I'm finding is you get really stock-looking images, and then you're just going to quit.
You'll see some of that in some of the other examples I'll give.
I highly recommend you copy this.
You can also use ChatGPT to help you.
Maybe you don't know the difference between a Contax T2 film camera, 35mm, and 40mm.
I didn't, so I used ChatGPT to help me figure this out.
This is a good opportunity to create brand books and packaging.
There's a startup idea right there: selling this as a service to companies.
I think a lot of people would pay for it.
Another prompt I did was to try to get photos that I can then plug into Seedance 2.0 to create ads and commercials.
If you've seen these incredible AI video commercials, and I've done tutorials on how to create them with my friend PJ Ace, one of the things you're going to need to create them is a visual direction.
So I said I'm developing a Super Bowl-style ad, 30-second ad for Shopify, and I have this whole storyline where an employee hates his boss and he creates a football merch line.
He basically sells it on Shopify and then he ends up quitting his job full time.
And I basically said, generate eight different visual directions for the same story.
Wes Anderson, Apple Shot on iPhone, Nike Just Do It.
And you can see that some of these are good, and you'll see some of them are not good.
So the Wes Anderson one, this is the Wes Anderson vibe.
As a set of reference images that you'd actually create and put into Seedance 2.0, this is really good.
Although what I don't like about this is that this looks like Jason Schwartzman.
If you watch Wes Anderson films, this just looks like an image of Jason Schwartzman, who appears in a lot of his films.
So it's just a little too close, right?
But the vibe is definitely correct.
This looks a lot like Bill Murray, again, a little too much.
But I think they did a really good vibe check on that.
The Apple Shot on iPhone looks like stock photos.
The Nike Just Do It, I think, looks really good.
Come on, does that look like AI?
That does not look like AI to me.
And I think the jacket is on point, the jeans are on point.
These two don't look like Nike Just Do It to me, but I think this one is really good.
Cinematic: Images 2.0 is extremely good at cinematic.
Use that word a lot.
This is not good.
This is really good.
So, you know, two out of three in cinematic.
So I highly recommend you use cinematic here.
The other one's not so good.
Not so good.
Not bad here.
Not bad.
Not bad.
Crazy.
I don't like that.
And then Modern Minimalist.
This is okay.
This is good.
That one.
So one out of three here.
So some hits, some misses here.
I think for me, the best ones were Wes Anderson and the Nike Just Do It.
Anything cinematic is going to work.
So the lesson for me on this one was you can't just say, I want you to generate a visual direction, Wes Anderson.
It needs to be way more in-depth.
In the first example I gave, I was a lot more in-depth about giving constraints around some of these references.
Please, please do that.
I want to give a couple more examples.
UI, let's see how it is with UI.
I said, hey, generate a UI mock-up for a new feature in Idea Browser.
And then I said, go look at the app at ideabrowser.com because you can tell it to go and do things.
And I give it a feature.
I say, I want a new leaderboard with the biggest idea junkies because it's a website for people into startup ideas and a way for people to meet co-founders.
I give it a visual style.
I say research the style, again, because it can go ahead and do that.
I say I want it clean, fresh, a hint of color, a modern SaaS app.
And then I ask it to give me four variations of the same screen.
Important to note, when doing UI in Images 2.0, you're going to want to include your outputs.
You're going to want to say, I want this resolution.
I want native macOS window chrome.
Saying I want realistic data in every cell is really important too.
Because I noticed that if you don't, you're just going to get bad data.
So overall, it did exactly what I asked.
It did the Kanban, pretty modern and clean.
Like I asked, it put it in the window chrome.
Is it the most beautiful thing?
No, but I didn't ask it to make it the most beautiful thing.
And this was just a one shot.
And to prove my point that you do need to give it the output very specifically, I basically said, hey, I want this feature as a mobile app.
And it made this super long mobile app.
Obviously this is not typical iOS dimensions.
So you'd have to say so in the prompt; I just did that as a test to see, is it going to give me typical iOS dimensions?
It did not.
So again, you want to do long, long, long, long prompts for Images 2.0.
That's just the way you're going to get the most out of it.
A couple more quick ones and then we're going to move on to startup ideas and stuff like that.
It's really good at apparel.
I basically said, give me photorealistic product photography of a new apparel item I'm launching called Fourth Wave.
I said, generate six shots.
This looks great.
If you're building something and you want a merch line and you don't want to actually create the merch, you want to see if people would buy it before you sell it, use something like Images 2.0.
Because I think it looks quite good.
Last thing I'll say is it's gotten a lot better at just illustrations.
So the illustrations before 2.0, they all looked the same.
But I tested creating an editorial illustration in the style of a New York Times op-ed: flat vector, limited three-color palette.
I used ChatGPT to help me create that.
I basically started off by saying I want something in the style of the New York Times op-ed section, and it gave me the language to use.
Look at that.
It's helpful.
How is this helpful from a business perspective?
If I'm doing proposals and I want them to look really good, I'm going to use Images 2.0.
If I'm doing a one-pager or a PowerPoint presentation, things like that, and I want to spruce it up with illustrations, now that you can also output eight of them, it's super, super helpful.
I want to just remind people before we move on from 2.0 that every business has four creative bottlenecks.
One is making new marketing content.
Two is making internal content like decks, docs and training.
So that's kind of like what we were talking about a little bit with the illustrations, sprucing those up.
Third is explaining things visually.
If you can do that, you're going to get more customers.
You're just increasing the odds of success.
And four is testing before building, like we talked about with creating UI before building it or creating a merch line before actually printing the merch and spending money on it.
So use 2.0 to help you figure out these things.
That being said, and I have no affiliation with this company, I think that Glyph, which is basically like a creative LLM super agent, is really, really good and worth using too.
So this is my go-to.
For example, I create my own YouTube thumbnails.
And I use this to come up with my own YouTube thumbnails.
And it just helps, I find, get really good outcomes.
So like I said, no affiliation.
I'm going to be using Glyph.
I'm going to be using Images 2.0.
I use Nano Banana Pro.
I use all of these.
To be honest, I've obviously tried Images many times, but it never made it into my cocktail, my Swiss Army knife of Nano Banana and Glyph and stuff like that.
And it's now in the rotation.
I'm going to use it.
Is it better than Glyph?
Is it better than Nano Banana Pro?
TBD.
But it's certainly exceeded my expectations so far.
So that's my quick review slash news item: ChatGPT Images 2.0 is here.
Oh, actually, before I move on, I just wanted to say: whether you use Images 2.0 or any creative LLM like Glyph or Nano Banana Pro, for any asset you're going to need to nail these five things.
One, context.
So what is the asset for?
Is it for a landing page, is it for an app store, is it for a deck?
You saw how important it was when I actually gave Images 2.0 the output: I need it to be in this size.
If you don't have that, if you don't say that I need a YouTube thumbnail this size, it just likely isn't going to work.
Two, style references.
Name companies whose visuals you want to look like.
You saw that in some of my prompts like Stripe, Linear, Vercel, companies you look up to.
And then use ChatGPT or Claude or whatever to help you create those prompts.
Because you might just be like, I want it to look like Linear.
But you don't have the vocabulary for what that actually means unless you're a designer.
Or if you're trying to do a photo shoot, for example, like we did with the Wild Roman thing, you might not know a lot about cameras or you might not know a lot about lighting.
Ask the LLM to help you figure out what those things are.
Number three, palette: specific hex codes work.
So using hex codes is really important.
Four, obviously copy.
You're going to want real plausible text.
You don't want to create lorem ipsum when you're creating your outputs.
And finally, five, aspect ratio and resolution are important so it drops into production without any rework.
So there you have it, that's Images 2.0.
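Those five things work well as a pre-flight checklist before you send a prompt. Below is a minimal sketch of such a check. The `AssetSpec` class and `check_spec` function are hypothetical names invented for illustration, and the hex codes and copy in the example are made-up placeholders, not real brand values.

```python
import re
from dataclasses import dataclass

# A "specific hex code" here means the six-digit form, for example #1A2B3C.
HEX_RE = re.compile(r"#[0-9A-Fa-f]{6}")


@dataclass
class AssetSpec:
    context: str                 # 1. what the asset is for
    style_references: list[str]  # 2. companies whose visuals you want to echo
    palette: list[str]           # 3. hex codes
    copy: str                    # 4. real, plausible text
    width: int                   # 5. output size in pixels
    height: int


def check_spec(spec: AssetSpec) -> list[str]:
    """Return one message per checklist item the spec fails."""
    problems = []
    if not spec.context.strip():
        problems.append("context: say what the asset is for")
    if not spec.style_references:
        problems.append("style references: name at least one company")
    if not spec.palette or any(not HEX_RE.fullmatch(c) for c in spec.palette):
        problems.append("palette: use specific hex codes")
    if not spec.copy.strip() or "lorem ipsum" in spec.copy.lower():
        problems.append("copy: use real, plausible text")
    if spec.width <= 0 or spec.height <= 0:
        problems.append("aspect ratio and resolution: give pixel dimensions")
    return problems


good = AssetSpec(
    context="YouTube thumbnail",
    style_references=["Stripe", "Linear", "Vercel"],
    palette=["#0B1F3A", "#F5E9DA"],          # placeholder colors
    copy="I tested Images 2.0 for a week",   # placeholder headline
    width=1280,
    height=720,
)
vague = AssetSpec(
    context="",
    style_references=[],
    palette=["blue"],
    copy="Lorem ipsum dolor sit amet",
    width=0,
    height=0,
)

print(check_spec(good))
for problem in check_spec(vague):
    print(problem)
```

An empty list means the brief covers all five items. Anything in the list is worth fixing before you spend a generation on it.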
Now we're switching gears.
Now I want to talk to you about a tweet I saw that put me onto this new tool that I'd never heard of and that I just started using.
I saw this tweet from Blake Robbins.
He goes, No Scroll is one of the most magical AI experiences I've had in a while.
It opened my eyes to how AI agents might actually show up in daily life.
Not as one giant assistant, but as small, focused products that do one thing insanely well.
So I'd never heard of this product, No Scroll.
Basically, if you go to their website, no affiliation, No Scroll monitors the situation so you don't have to.
You set the agenda, it never stops reading.
It says here, your beats, your briefing.
It's like having the smartest person you know read everything you care about online 24-7 and text only you what matters.
If you click "text your agent," you scan the code with your phone, you add it to your iPhone, and you have a conversation with it.
Now, I'm not going to blow it for you.
But I used this product for five minutes and I was blown away with how good it was.
And that was just five minutes.
So I'm just starting to get to use it.
I only used it for five minutes, but it did feel like an aha moment.
It did feel like this is how we're going to be using a lot of assistants going forward, how we're going to be using a lot of AI agents going forward.
And I'll give you an example.
In the onboarding, I don't know if you can see this, I added No Scroll to my contact list.
I said, hey, No Scroll.
And it basically said, Greg Isenberg, you're the CEO of Late Checkout, the guy who turned finding startup ideas on Reddit into a whole empire.
And then it knew I had 158,000 newsletter subscribers, a podcast that probably has more episodes than most people have business ideas.
And then it knew that I signed up with email.
It says, signing up via email instead of X though, the man with 237,000 LinkedIn followers is going stealth mode on me with a winky face.
What brings you to No Scroll?
And then I say ha ha ha.
And then it did a reaction to my ha ha ha.
Now this felt like I was talking to a person.
So whether or not No Scroll becomes a big company in the future or makes it into your workflow, it is a glimpse into the future of how you're going to interact with these products.
So I think it's just worth it for that.
I was happy that Blake had posted that because I had never heard of it.
So that's the tool of the week.
It's called No Scroll and I think it is free to download.
Okay, the startup idea of the week is one that I saw on Idea Browser and I would really like for people to steal.
So the title says here, learn-to-draw app with AI feedback on every sketch.
Art education is built for people who are already drawing.
Long courses, YouTube tutorials that assume context, the hobbyist who wants to improve.
And it gives this idea of Mark Day, a daily drawing trainer that builds skill through short, structured lessons.
Each one takes under 10 minutes and connects directly to the last.
Gesture work builds into shading, shading builds into form.
AI feedback on uploaded sketches catches the specific mistakes a tutorial never pauses to address: a proportion issue in the left shoulder, a shading gradient that flattens instead of rounds.
The habit is the product: 10 minutes, three sessions a week, and every uploaded sketch feeds a model that learns where beginners break down.
So basically it's this mobile app charging $5 a month or $50 a year that helps people learn to draw.
And I think in this AI age, more and more people are going to want to do art and it's not fun if you're not improving or getting good at it.
So I think that this is a big opportunity.
I screenshotted this Idea Browser idea and I put it into Claude Design.
Literally, I just said, give me three directions based on this idea in wireframes.
And I want to go through this really quickly because I think that this is, first of all, beautifully done.
Claude Design absolutely cooked with the three directions.
And my favorite thing to do with Claude Design right now is using it for wireframes.
So it came up with three ideas that I think someone should steal.
I think all three of them are really good.
Daily habit.
Look at this mobile app.
So basically home goes into a lesson, goes into sketch, goes into AI feedback.
Look at this.
You can see straight lines, ellipses.
Okay, now you're doing curved lines and you can go and actually do it.
It shows the strengths and concerns.
It says strengths, low friction, feels casual.
And then there's open questions.
This is incredible.
Do we allow finger or stylus only?
Should there be a streak freeze or six day logic?
This is really cool.
Duolingo, BeReal energy.
I think this would work.
Then, even though the original idea from Idea Browser was a mobile app, it goes and says, you know what, we're actually going to go and create something which looks like a website or an iPad app.
So unclear, but I think that this also looks really cool.
It's the desktop app.
So side by side for power users.
So it says here, it's a big canvas, persistent AI critique rail, curriculum is in the sidebar.
And the vibe is more of a Figma, Procreate, serious tool.
I personally think it's easier to create the Daily Habit app and it would do well, but I actually think that this would also do well; I just wouldn't start with this.
And then there's the Ritual Journal.
So the Ritual Journal opens like a sketchbook.
So at the left you'll see today's concept and at the right you'll see a blank page plus a gentle critique.
So this is more of a Moleskine, Kindle, meditation-app quiet thing where the core loop is an open book.
You open the book, read the concept, sketch on it, the AI leaves margin notes, and you close the book.
So more of like a book vibe.
Really cool, really, really forward thinking.
I honestly wouldn't have thought of this and really interesting.
But my favorite really is I would say the Daily Habit followed by Studio Canvas followed by the Ritual Journal.
Someone please steal this idea and then shout out to Claude Design for cooking so darn much on this.
Let's talk framework.
So the framework of the week is how do you find a vertical AI business to build?
So first of all, of course it's cool to build something like No Scroll, like a horizontal app that a billion people could use.
But if you want to build a business that has the highest likelihood of a million dollars, five million, 10 million ARR, my thinking is that with a vertical AI business you have a higher likelihood of success.
So what's a framework for coming up with an idea in vertical AI?
By the way, when I say vertical AI, I just mean that it's an AI software business that focuses on a specific niche, a vertical, and a specific workflow.
So how do you actually go and find these ideas besides going on things like ideabrowser.com?
You're going to want to find a boring pain point.
Now, you might be able to find a boring pain point in your job, or through a friend, or through ideabrowser.com's trends feature and stuff like that.
But you're going to want to figure out what that boring pain point is.
You're going to want to map out that entire workflow.
You can use Claude or ChatGPT to actually map that out.
Or if you know that workflow in and out, maybe that's a business that you're in, you've been working in for five years, 10 years, even more.
If you know it, that's your unfair advantage.
So map it out.
And when I say map it out, I mean literally draw it out.
Or use Claude Design to draw it out for you.
Then you want to do the job.
Step three is to do the job as a service.
What do I mean by that?
Say you've been working in SEO for 10 years.
You've mapped out the workflow.
Let's say you worked in SEO, but you weren't actually doing the SEO.
You might actually want to do it.
Maybe you worked in operations, maybe you worked in finance, maybe you worked in who knows.
But the point is you want to actually do it so you can document, step four, the edge cases and failures.
Once you've done that, you've got the workflow.
First and foremost, you understand the pain point.
You've got the workflow.
You know how to do the job as a service.
You now know how to get customers for that job.
That might mean you've created an X account and you're creating content or you now understand the price point that people want.
Just by actually doing and creating the business and doing it manually, you're going to understand it way better than you did before.
The fifth step is actually adding vertical agents to replace the steps.
So that's when you actually create an agent as a software business.
So the mistake a lot of people make is they're like, I want to go and automate SEO, therefore I'm going to start by just building an SEO agent.
But there's so much you're missing, right?
So you can use your clients, basically.
Say you have an SEO agency: you can use your clients as the first customers of your agent business.
So that's why I think that it's important to start by finding the boring pain point, mapping out the workflow, doing the job, and then documenting the edge cases and failures.
Because you're going to learn; maybe you're not doing SEO for everyone, maybe you're doing SEO for fintech only.
So you add vertical agents to replace the steps.
You leverage proprietary data.
Now you have proprietary data because say you have clients in the fintech space.
So now you know that particular vertical really well.
Then you iterate.
And you expand until the agent owns the workflow.
So maybe you started with just one small part of SEO and you expand from there.
So this is just something to get your creative juices flowing around building vertical AI businesses, how to think about it, how to do it with the least amount of risk.
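One way to picture "expand until the agent owns the workflow" is to track each step of the mapped workflow and hand steps to an agent one at a time. This is only a sketch under my own assumptions. The `Step` fields, the rule of automating the most-practiced step that has no unresolved edge cases, and the SEO step names and numbers are all hypothetical, not something stated in the episode.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    manual_runs: int       # how many times you did this step by hand for clients
    open_edge_cases: int   # documented failures you don't yet know how to handle
    owner: str = "human"   # "human" or "agent"


def next_step_to_automate(steps):
    """Pick the human-owned step you have done most often with no open edge cases."""
    candidates = [
        s for s in steps
        if s.owner == "human" and s.manual_runs > 0 and s.open_edge_cases == 0
    ]
    return max(candidates, key=lambda s: s.manual_runs, default=None)


def agent_coverage(steps):
    """Fraction of workflow steps currently owned by an agent."""
    return sum(s.owner == "agent" for s in steps) / len(steps)


# Hypothetical SEO-for-fintech workflow, mapped after doing the job as a service.
steps = [
    Step("keyword research", manual_runs=40, open_edge_cases=0),
    Step("content brief", manual_runs=25, open_edge_cases=2),
    Step("on-page audit", manual_runs=30, open_edge_cases=0),
    Step("client reporting", manual_runs=12, open_edge_cases=1),
]

first = next_step_to_automate(steps)
first.owner = "agent"                 # replace one step with a vertical agent
second = next_step_to_automate(steps)

print(first.name, second.name, agent_coverage(steps))
```

The design choice is that steps with unresolved edge cases stay manual until you understand them, which is the reason for doing the job by hand first.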
Even if you have a job, this is something that you can just get started with.
Hope that was helpful.
There's one last thing I want to end with.
And this is a quote that someone had actually sent to me.
And it really did fire me up.
It goes, write it on your heart that every day is the best day in the year.
He is rich who owns the day, and no one owns the day who allows it to be invaded with fret and anxiety.
Finish every day and be done with it.
You have done what you could.
Some blunders and absurdities no doubt crept in.
Forget them as soon as you can.
Tomorrow is a new day.
Begin it well and serenely, with too high a spirit to be cumbered with your old nonsense.
This new day is too dear, with its hopes and invitations, to waste a moment on the yesterdays.
I'm recording this right now at nine.
It was a long day for me.
I had some ups and I had some downs.
But when you read something like this, you kind of have to forget about the downs and forget about the ups too even.
And just wake up tomorrow and be like, I'm going to do my best.
And that's the way to think about every single day.
Conquering the day.
This is an incredible time to be building a business, working on ideas, getting what's in your head out there into the world and sharing it with the world.
So I feel very lucky to be talking to you, the fact that I can record and thousands of people could get my ideas.
But I'm also just grateful that I, and you listening, could go and take an idea and put it out in the world and get people, strangers, to use that thing.
So what an incredible time we're in.
And just a reminder, when I read this by Ralph Waldo Emerson, one of the great figures in American literature, it just reminded me that even though building a startup is tough and even though life in general is tough and there are ups and downs, go and conquer the day.
You got this.
I'm rooting for you.
And I'll always be there to share every piece of alpha to help you get your creative juices flowing.
And I'll see you next time.
