Advent of GenAI Hackathon: Recap of Challenge 3

Eugenie_Wirz · ‎12-12-2023

Brace yourselves for the third challenge of our thrilling Advent of GenAI Hackathon! Dive into the challenge here: Advent of GenAI Challenge #03 - Flipbook Odyssey, and generate an animated GIF or a movie using the Stable Diffusion text-to-image and image-to-image notebooks.

Challenge 3: Flipbook Odyssey - Released on December 7, 2023, 9:00 AM PST

Objective: Create an animated visual narrative through a sequence of image-to-image transformations using Stable Diffusion models. You can transform a base image step by step, following a chosen theme, to illustrate a progression or change. The final result can be a series of static images or a GIF that shows the transformation journey.

Hints:

Starting Point: Begin with a base image that sets the stage for your narrative. This could be a simple sketch, a photograph, or any image that serves as a starting point for your story.
Image Transformation: Use the image-to-image capabilities of Stable Diffusion to transform your base image step by step. Each transformation should represent a progression or a change in your narrative.
Prompt Crafting: For each transformation, carefully craft prompts that guide the Stable Diffusion model. Be specific about the changes you want to see but remember to keep your prompts clear and not overly complicated to avoid confusion.
Sequence & Flow: Ensure that each transformation logically follows from the previous one, maintaining a clear and coherent flow in your narrative. Think of each image as a frame in a flipbook that when put together tells a complete story.
Experimentation: Experiment with different backgrounds, styles, and elements to see how they alter your narrative. For instance, changing the background to a snowy landscape or a rainstorm can add dramatic effects to your story.

Technical Setup: Use the provided text-to-image and image-to-image notebooks, and consider downloading custom models from Hugging Face, such as inpainting models, if you need them to enhance your transformations. Set up your pipeline using the necessary libraries and the Stable Diffusion model.

The set of images is then combined to create a dynamic experience. You can merge text-to-image and image-to-image Stable Diffusion techniques using the two notebooks provided on IDC under Gen AI Essentials.
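
For readers who want to combine their frames in code rather than by hand, here is a minimal sketch using the Pillow library. The function names, the default frame rate, and the file names in the usage note are illustrative assumptions and are not taken from the challenge notebooks.

```python
from pathlib import Path


def frame_duration_ms(fps: float) -> int:
    """Convert a playback rate (frames per second) to a per-frame delay in ms."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return round(1000 / fps)


def save_gif(frame_paths, out_path, fps=2.0):
    """Combine image files, in the order given, into a looping GIF."""
    # Imported here so the helper above works even without Pillow installed.
    from PIL import Image  # third-party: pip install Pillow

    if not frame_paths:
        raise ValueError("need at least one frame")
    frames = [Image.open(Path(p)).convert("RGB") for p in frame_paths]
    first, rest = frames[0], frames[1:]
    first.save(
        out_path,
        save_all=True,            # write every frame, not just the first
        append_images=rest,       # remaining frames, in order
        duration=frame_duration_ms(fps),
        loop=0,                   # 0 means loop forever
    )
```

A call such as `save_gif(["frame_00.png", "frame_01.png"], "flipbook.gif", fps=2)` would show each frame for half a second. A slow frame rate tends to suit flipbooks whose frames differ noticeably from one another.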

The Top Submissions of Challenge 3

We have selected the top projects from numerous submissions across the nominations in this challenge.

A Digital Wizard: Circle of Life and Fate of a City by Tomáš Barczi

Commenting on the Flipbook challenge, Tomáš Barczi shared insights into his animated visual narrative concept, aiming to depict a family's growth. "The initial idea involved 'family portraits' evolving incrementally, but generating it posed challenges in consistency," he noted. Undeterred, Tomáš adopted a more manageable approach, generating and combining components manually. Unsatisfied, he researched and found a solution involving morphing between prompts for image generation. "Creating a storyboard and interpolating between embeddings improved consistency, making it easier to generate a cohesive video," Tomáš emphasized. This innovative method promises a more efficient and visually appealing outcome for the Flipbook challenge.
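
The idea of interpolating between embeddings can be illustrated with a small sketch. This is not Tomáš's code: it shows plain linear interpolation on short lists of numbers, and the function names are hypothetical. In a real pipeline the vectors would be the text-encoder outputs for two storyboard prompts, and each blended embedding would be fed to the model instead of a text prompt (in the diffusers library this is typically done through the `prompt_embeds` argument).

```python
def lerp(a, b, t):
    """Linearly blend two equal-length vectors: t=0 gives a, t=1 gives b."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def interpolate_embeddings(start, end, num_frames):
    """Return num_frames vectors moving evenly from start to end (inclusive)."""
    if len(start) != len(end):
        raise ValueError("embeddings must have the same length")
    if num_frames < 2:
        raise ValueError("need at least two frames (the two endpoints)")
    return [
        lerp(start, end, i / (num_frames - 1))
        for i in range(num_frames)
    ]


# Toy example with 2-dimensional "embeddings" (real ones are far larger).
frames = interpolate_embeddings([0.0, 2.0], [1.0, 4.0], 3)
print(frames)
```

Because neighbouring frames are generated from nearly identical embeddings, they tend to differ only slightly, which is one plausible reason this approach helps with consistency.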

Digital artistry at its finest: The Justice of The War by Aditya Krishna RS and Thadeus Cruz Govindapillai

Process of Making:

First, we generated our base image using the Text-to-Image model. We then used the Image-to-Image model to enhance each frame of the GIF, which involved adding negative prompts and adjusting the strength of the Image-to-Image diffusion. We used several negative prompts to steer the model away from unwanted features.
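
As a rough illustration of the two knobs mentioned here, the sketch below wraps an image-to-image call that takes a negative prompt and a strength value. It is not the team's code. The `pipe` object is assumed to behave like the `StableDiffusionImg2ImgPipeline` from the diffusers library, and the default values are placeholders. The `effective_steps` helper reflects how, to our understanding, diffusers scales the number of denoising steps by the strength.

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    """Approximate number of denoising steps an img2img call actually runs.

    A low strength keeps the result close to the input image; a strength
    of 1.0 runs all steps and largely ignores the input.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return min(int(num_inference_steps * strength), num_inference_steps)


def refine_frame(pipe, image, prompt, negative_prompt,
                 strength=0.3, guidance_scale=7.5):
    """Run one image-to-image pass and return the resulting image.

    `pipe` is assumed to be an img2img pipeline, for example one created
    with StableDiffusionImg2ImgPipeline.from_pretrained(<model id>).
    """
    result = pipe(
        prompt=prompt,
        image=image,
        negative_prompt=negative_prompt,  # features to steer away from
        strength=strength,                # how far to move from the input
        guidance_scale=guidance_scale,
    )
    return result.images[0]
```

With 50 inference steps and a strength of 0.5, for example, `effective_steps` reports 25 steps, so only about half of the schedule is applied to the input frame.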

Aditya Krishna RS offers insights into his experience with the hackathon challenge, describing it as both interesting and tricky. Reflecting on the process, he notes, "the Day 1 challenge paved the way for us on how to use image-to-image and text-to-image diffusion models to experiment with." During the Text-to-Image phase, the team dedicated 40 to 50 attempts to secure a strong base image and a fitting theme for the flipbook challenge. Upon finalizing the base image, the Image-to-Image model played a crucial role in achieving the desired result. Aditya highlights their surprise at the accuracy of these models, especially with slight parameter adjustments, providing valuable hands-on experience in understanding their functionality.

Dream weaving in pixels: Alex’s Lunar Odyssey by Alvin Lee Neam Heng

Here are my prompts:

A young boy named Alex wearing a blue sweater standing in his backyard with a clear night sky, a curious expression on his face. anime art style
A young boy named Alex wearing a blue sweater inside his bedroom surrounded by books about space exploration, reading about the moon. anime art style
A young boy named Alex wearing a blue sweater presenting a model rocket he built himself. anime art style
A young boy named Alex wearing a blue space suit inside a spaceship and sitting in the pilot's seat. anime art style
A young boy named Alex wearing a blue spacesuit walking on the surface of the moon, surrounded by the vast lunar landscape. anime art style

Alvin Lee explains his approach to address the challenge: "It was challenging to fine-tune the model to generate images that match my imagination. So, I had to understand how AI transforms images, then come up with a descriptive prompt that guides the model to make incremental changes. With the help of AI, a Harry Potter Magic newsletter might be coming soon."

Blending creativity and technique brilliantly: Canva of Life by Atif Ahmed

Atif Ahmed presented the project “Canva of Life: A Time-Lapse of human life” with comprehensive documentation of the process.

Process Documentation:

Generation of Seed Image.

Used text-to-image generation to create the base image for the animation.
Model: stabilityai/stable-diffusion-xl-base-1.0
Prompt: green eye girl in her 20s with nice smile with short red hair, slight skin blemishes, front facing, gray blur background, studio photo shoot, clear shot, bright image, photography, portrait, real photo, photoshoot, beautiful

Selected one Image as the seed for remaining images.

Used the image-to-image SDXL model to create the next image of the animation by slightly changing the prompt, in this case to a teenager.
Model: stabilityai/stable-diffusion-xl-base-1.0
Prompt: green eye teen age girl with nice smile with short red hair, slight skin blemishes, front facing, gray blur background, studio photo shoot, clear shot, bright image, photography, portrait, real photo, photoshoot, beautiful

Manually sort the generated images from youngest-looking to oldest-looking.

Pick the youngest-looking image and use it as the input for the next image sequence.

Again used the image-to-image SDXL model to create the next image of the animation by slightly changing the prompt, in this case to a school girl.
Model: stabilityai/stable-diffusion-xl-base-1.0
Prompt: green eye young school girl with nice smile with short red hair, slight skin blemishes, front facing, gray blur background, studio photo shoot, clear shot, bright image, photography, portrait, real photo, photoshoot, beautiful

Pick the youngest-looking image and use it as the input for the next image sequence.

Again used the image-to-image SDXL model to create the next image of the animation by slightly changing the prompt, in this case to a toddler girl.
Model: stabilityai/stable-diffusion-xl-base-1.0
Prompt: green eye toddler girl with nice smile with short red hair, slight skin blemishes, front facing, gray blur background, studio photo shoot, clear shot, bright image, photography, portrait, real photo, photoshoot, beautiful

Manually sort the generated images from youngest-looking to oldest-looking.

Pick the youngest-looking image and use it as the input for the next image sequence.

Change the prompt, then rinse and repeat for infants.

For older ages, start again from the first base image and change the prompt gradually toward older ages until you have the whole sequence.

Go to the Imgflip GIF Maker to turn the image sequence into a GIF.
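
The prompt-swapping and ordering steps above can be sketched in a few lines. This is an illustration, not Atif's code: the template is taken from the prompts listed in the process, while the function names and the stand-in frame labels are assumptions. The key point is that the younger frames are generated backwards from the base image, so they have to be reversed before being joined with the older frames.

```python
# Shared prompt text; only the subject description changes between stages.
PROMPT_TEMPLATE = (
    "green eye {subject} with nice smile with short red hair, "
    "slight skin blemishes, front facing, gray blur background, "
    "studio photo shoot, clear shot, bright image, photography, portrait, "
    "real photo, photoshoot, beautiful"
)

# Stages used when working backwards from the base image ("girl in her 20s").
YOUNGER_SUBJECTS = ["teen age girl", "young school girl", "toddler girl"]


def build_prompt(subject: str) -> str:
    """Fill the shared template with one age-stage description."""
    return PROMPT_TEMPLATE.format(subject=subject)


def assemble_timeline(base, younger_chain, older_chain):
    """Order frames from youngest to oldest.

    younger_chain is in generation order (base -> teen -> ... -> youngest),
    so it is reversed; older_chain is already in ascending age order.
    """
    return list(reversed(younger_chain)) + [base] + list(older_chain)


# Stand-in labels instead of real images, just to show the ordering.
# The older-age labels are hypothetical; the source does not list them.
timeline = assemble_timeline(
    "20s", ["teen", "school", "toddler"], ["older_1", "older_2"]
)
print(timeline)
```

Keeping everything except the subject fixed in the prompt is what lets each frame stay close to the previous one.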

Unleashing waves of creativity: Adrift at Dawn by Amit Sadhu and Astronaut Dancing Hip-Hop by Vimal Menon

Bestowed with the distinguished title "Unleashing Waves of Creativity," two developers have contributed their exceptional works to this creative endeavor. The first piece is crafted by Amit Sadhu.

Amit Sadhu noted, "From my experience in challenge 1, I learned that the stabilityai/stable-diffusion-2-1 model fares well in generating pictures of a person looking away from the camera, which gives it a better look without needing to render a face. I applied this learning to challenge 2, maintaining consistency with fundamental elements in the picture – the boat in the water and the color palette of a sunset. By choosing a picture with subtle but significant elements and finding the right balance of positive and negative prompts, I completed my piece, 'Adrift at dawn.' The hackathon by Intel and Prediction Guard has been a great learning experience and a lot of fun! I am certainly looking forward to the next challenges." Amit Sadhu's strategic approach showcases his adaptability and creativity in overcoming challenges, making his project a success.

The second honored project is the imaginative creation of Vimal Menon.

The prompt used was:

In the provided image, the character is {character details from the initial prompt}, and an expert {dance style chosen by the user} dancer performing. Based on the image given, what next dance move will the character perform? Move only hands and legs. Keep the background the same. Keep the character the same. Keep the character's face the same.

This prompt is iterated for the number of dance steps, feeding the previous image generated as a reference image. Once all the images are generated, they are combined into a GIF and saved.

For the animation created, the prompt used was:

1 astronaut in an orange suit dancing, realistic, pixar animation, 3d concept, 3d artistic render.

The negative prompt was:

Disfigured, ugly, bad, noisy, blurry, out of focus, immature, b&w, missing limbs.

And the dance style was hip-hop, with the model used being stabilityai/stable-diffusion-2-1.
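
The iteration described above might look roughly like the following sketch. It is not Vimal's program: `text2img` and `img2img` are hypothetical callables standing in for the two model calls, and the sketch assumes the initial image counts as the first frame, so N dance steps yield N + 1 frames.

```python
# Template from the write-up, with the two user-supplied fields as placeholders.
DANCE_PROMPT = (
    "In the provided image, the character is {character}, and an expert "
    "{style} dancer performing. Based on the image given, what next dance "
    "move will the character perform? Move only hands and legs. Keep the "
    "background the same. Keep the character the same. Keep the character's "
    "face the same."
)


def generate_dance_frames(text2img, img2img, character, style, num_steps):
    """Generate an initial frame, then feed each frame back in for the next.

    text2img(prompt) -> image and img2img(image, prompt) -> image are
    assumed wrappers around the text-to-image and image-to-image models.
    """
    frames = [text2img(character)]
    step_prompt = DANCE_PROMPT.format(character=character, style=style)
    for _ in range(num_steps):
        # The most recent frame becomes the reference image for the next one.
        frames.append(img2img(frames[-1], step_prompt))
    return frames
```

Passing the previous frame back in at every step is what keeps the character and background roughly stable, although small changes can still accumulate over many steps.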

Scope of improvements:

A new layer between the Text2Img and Img2Img layers can be built using an image recognition/annotation model to annotate the hands and legs of the character. The annotated masks can then be used with StableDiffusionInpaintPipeline.
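
To make the proposed improvement more concrete, here is a small sketch of how annotated regions could be turned into an inpainting mask. The bounding-box input format and the function names are assumptions, since no annotation model is specified. The sketch follows the usual diffusers convention that white mask pixels are repainted and black pixels are kept.

```python
def boxes_to_mask(width, height, boxes):
    """Build a mask as rows of pixel values: 255 inside any box, 0 elsewhere.

    Each box is (left, top, right, bottom) in pixels, with right and bottom
    exclusive. Boxes are clipped to the image bounds.
    """
    mask = [[0] * width for _ in range(height)]
    for left, top, right, bottom in boxes:
        for y in range(max(0, top), min(height, bottom)):
            for x in range(max(0, left), min(width, right)):
                mask[y][x] = 255  # white: region the model may repaint
    return mask


def to_pil_mask(mask):
    """Convert the row-based mask into a grayscale Pillow image."""
    from PIL import Image  # third-party: pip install Pillow

    height, width = len(mask), len(mask[0])
    image = Image.new("L", (width, height))
    image.putdata([value for row in mask for value in row])
    return image


# Hypothetical limb box on a tiny 4x3 canvas.
print(boxes_to_mask(4, 3, [(1, 0, 3, 2)]))
```

The resulting image could then be passed as the mask to an inpainting pipeline, so that only the hands and legs change between frames.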

In approaching the challenge, Vimal Menon noted, "Taking cues from the previous 2 challenges, I thought it would be a good learning to implement the challenge programmatically." Considering multiple ideas, he contemplated modifying an initial image's background, citing the observed effectiveness when setting the strength parameter to 0.1 in the image2image API call. Another intriguing idea involved choreographing dance steps for a chosen character, adding a dynamic element to the hackathon challenge.

Vimal described his program's functionality in detail. Users provide an initial prompt specifying the character and dance parameters, including the number of dance steps and the dance style (tango, salsa, ballet, hip-hop, modern). Once these inputs are received, the program calls the Text2Img model to generate an initial image based on the user's prompt. Building upon this, the program uses the initial image as a reference to call the Img2Img model with a custom prompt, further refining the animated sequence.

Honorable Mentions

We are excited to honor the additional outstanding projects by Gopalakrishnan D, Faraz Ahmad, Priyanshu Pratap Singh, Pranav Raghavan CM, Ganesh Kumar, Kumari Pallavi, Ricky Ignatius, and Muhammad Aslam Khan Bin Mohd Rashid Khan.

Ready for more challenges like these? Join Intel® Liftoff for Startups, continue your journey in AI innovation, and be part of a community that is constantly expanding the frontiers of AI.