Project Goal
The goal of this project was to automate the creation of visually compelling LinkedIn carousel images using AI, specifically by leveraging the strengths of Google’s Gemini and OpenAI’s ChatGPT. The aim was to streamline the image creation process, making it faster and more accessible without requiring graphic design skills.
Medium post containing step-by-step instructions
How it was built
This project was built in three key steps:
- Gemini Gem Creation: A custom Gemini Gem was created, powered by Gemini 2.5, to interpret user requests and translate them into structured JSON objects. This Gem was given a flexible JSON template that could be adapted for various visual content needs.
- Carousel Template Generation: The Gemini Gem was then fed prompts related to specific LinkedIn post content, such as code snippets or project descriptions. The Gem generated a series of tailored JSON objects, each designed to become a slide in a LinkedIn carousel.
- Image Generation with ChatGPT: Finally, the structured JSON objects were pasted into ChatGPT (GPT-4o), which used them to generate professional-looking images. This step is currently manual; future iterations are expected to automate it.
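The hand-off between the second and third steps can be sketched in Python. This is a minimal illustration, not the project's actual tooling: it assumes the Gem's reply is plain text containing one JSON object per slide, and the function names, the sample reply, and the prompt wording are all hypothetical.

```python
import json


def extract_json_objects(text: str) -> list[dict]:
    """Collect every top-level JSON object embedded in a block of text."""
    decoder = json.JSONDecoder()
    objects = []
    pos = 0
    while True:
        start = text.find("{", pos)
        if start == -1:
            break
        try:
            # raw_decode parses one JSON value starting at `start` and
            # returns it together with the index where parsing stopped.
            obj, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pos = start + 1  # not valid JSON here; keep scanning
            continue
        if isinstance(obj, dict):
            objects.append(obj)
        pos = end
    return objects


def build_image_prompt(slide: dict) -> str:
    """Wrap one slide object in a prompt to paste into ChatGPT (wording is an assumption)."""
    return (
        "Generate an image that follows this JSON specification:\n"
        + json.dumps(slide, indent=2)
    )


# Hypothetical Gem reply: prose mixed with one JSON object per slide.
reply = """Slide 1:
{"slide": 1, "headline": "Automating carousels"}
Slide 2:
{"slide": 2, "headline": "How it works"}
"""

slides = extract_json_objects(reply)
prompts = [build_image_prompt(slide) for slide in slides]
```

Each entry in `prompts` would then be pasted into ChatGPT by hand, which matches the manual step described above.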
The architecture relies on Gemini’s ability to turn user requests into structured JSON and on ChatGPT’s image generation capabilities, including its handling of text within images. The flexible JSON template gives detailed control over image style, text placement, composition, and more.
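To make the idea of a flexible template concrete, here is a minimal Python sketch. The field names (`style`, `composition`, `text`, `placement`, and so on) and their values are assumptions chosen for illustration; the project's actual template is not reproduced here.

```python
import copy
import json

# Hypothetical template: every key and default value is illustrative only.
TEMPLATE = {
    "style": {"theme": "minimal", "palette": ["#0A66C2", "#FFFFFF"]},
    "composition": {"aspect_ratio": "4:5", "layout": "centered"},
    "text": {"headline": "", "body": "", "placement": "top"},
}


def make_slide(headline: str, body: str, **text_overrides) -> dict:
    """Return a copy of the template filled in for a single slide."""
    slide = copy.deepcopy(TEMPLATE)  # deep copy so nested dicts are not shared
    slide["text"].update({"headline": headline, "body": body, **text_overrides})
    return slide


carousel = [
    make_slide("Project goal", "Automate LinkedIn carousel images"),
    make_slide("How it was built", "Gemini Gem, then JSON, then ChatGPT",
               placement="center"),
]

print(json.dumps(carousel, indent=2))
```

Keeping shared defaults in one template and overriding only per-slide fields is one way to keep a carousel visually consistent while still varying the text and its placement.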
Technologies used
- Google Gemini (with custom Gem functionality)
- OpenAI ChatGPT (GPT-4o)
- JSON (for structured data representation)