
Consistent Character Design with GenAI

Updated: Dec 12, 2023

In November 2023, following the creation of our initial concept art for the "Space Cruiser" project, we faced the challenge of developing 'consistent' assets. These assets were required to maintain a uniform appearance while varying in poses, clothing, and expressions. This posed a significant challenge. Did we succeed? Here's what we learned.

Key Takeaways:

  • Five minutes to create 12 HD assets per character. What more could we ask for?

  • ComfyUI emerged as a pivotal tool, streamlining our workflow and automating several repetitive tasks.

  • Achieving consistent emotions for photorealistic characters was complex, particularly for strong, meaningful expressions. However, the technology is improving quickly.

  • The automation in asset creation proved effective. With some initial preparation, we were able to produce high-quality assets efficiently.

Summary of assets generated

ComfyUI

The primary challenge in generating AI images lies in maintaining character consistency across different images. Our original idea was to generate various assets with a single click:

  • 1x Unit Portrait: A base image for use in the Gacha, unit description, and menus.

  • 3x Unit Dialogs: Images with identical poses but different outfits (armor, armor + helmet, no armor), maintaining the same facial expression as the Unit Portrait. Used in dialogues.

  • 8x Unit Miniatures: Each displaying a distinct in-game emotion (Idle, Gimmick, Hurt, Pain, Afflicted, Focus, Warcry, Happy) while retaining the same face as the Unit Portrait.

Given that Stable Diffusion couldn't facilitate the creation of these assets with a single click, we opted for ComfyUI. After several days of experimentation, we successfully established our workflow.
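As a rough sketch, the 12-asset batch we aimed for per character can be enumerated as a simple manifest. The file names below are illustrative, not our actual naming scheme:

```python
from itertools import chain

# Illustrative names only -- not the project's actual naming scheme.
DIALOG_OUTFITS = ["armor", "armor_helmet", "no_armor"]
EMOTIONS = ["idle", "gimmick", "hurt", "pain",
            "afflicted", "focus", "warcry", "happy"]

def asset_manifest(character: str) -> list[str]:
    """Enumerate the 12 assets wanted for one character:
    1 portrait + 3 dialog variants + 8 emotion miniatures."""
    portrait = [f"{character}_portrait.png"]
    dialogs = [f"{character}_dialog_{o}.png" for o in DIALOG_OUTFITS]
    miniatures = [f"{character}_mini_{e}.png" for e in EMOTIONS]
    return list(chain(portrait, dialogs, miniatures))

print(len(asset_manifest("space_cruiser_pilot")))  # 12 files per character
```

Generating this whole manifest in one queued run is exactly what the ComfyUI workflow described below automates.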

Our workflow in ComfyUI

Initially, ComfyUI appeared more complex than Automatic1111. After a few hours of acclimatization, however, its full potential for automated asset production became evident. With a single click, we could generate all the assets listed above and save them systematically in a designated folder. Furthermore, using the wildcard feature, we could seamlessly inject new characters previously designed in Automatic1111.
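The wildcard mechanism boils down to token substitution: each `__name__` token in a prompt is replaced by a random entry from a matching list. The sketch below is a minimal stand-in for ComfyUI's wildcard nodes (which usually read the lists from text files); the prompt and wildcard entries are illustrative:

```python
import random
import re

# Illustrative wildcard list -- in ComfyUI these usually live in text files.
WILDCARDS = {
    "character": ["grizzled space captain", "young navigator", "android engineer"],
}

def expand_wildcards(prompt: str, rng: random.Random) -> str:
    """Replace each __name__ token with a random entry from WILDCARDS[name]."""
    return re.sub(
        r"__(\w+)__",
        lambda m: rng.choice(WILDCARDS[m.group(1)]),
        prompt,
    )

rng = random.Random(42)
print(expand_wildcards("photo of a __character__, sci-fi uniform", rng))
```

Queuing the same graph repeatedly with a wildcard in the prompt is what lets one click produce a whole batch of distinct characters.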

But how does our workflow operate? For the Unit Portrait, the process was straightforward: we reused the same parameters as in Automatic1111 and adapted them for ComfyUI.

The processes for the other two asset types are detailed below.

Unit Portrait

For the Unit Portrait, there is not much to add to what was already said in article n°5: Generated Photorealist Characters. We simply took our Stable Diffusion flow, transposed it into ComfyUI, and added a wildcard to generate batches of characters.

Note that after designing this Unit Portrait, we create a similar one with a neutral expression, which serves as a reference to keep the face "in memory" for the later steps.

Space cruiser concept

Unit Dialogs

In developing the Unit Dialog assets, we had three specific objectives:

  • Diverse Outfits for Each Pose: Initially, we attempted to modify the clothing in an image by masking it and inpainting with a new prompt. However, automatic clothing detection proved unpredictable, even with recent Ultralytics-based detectors, and often produced imperfect masks. Consequently, we wrote a separate prompt for each outfit and injected the pose and facial changes afterwards.

  • Multiple Poses: Our aim was not limited to a single pose. We utilized OpenPose with a batch of predefined images, generating approximately twenty different poses with each click.

  • Consistent Facial Expressions: The most challenging aspect, as we'll discuss next. Here, we employed the ReActor tool to extract the face from the Unit Portrait and transplant it onto the generated image.
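Put together, the dialog pipeline above amounts to a nested batch: one queued generation per (outfit prompt × pose image), each followed by a face swap against the neutral portrait. A minimal sketch of that batch plan, with hypothetical prompts and file names:

```python
from itertools import product

# Hypothetical stand-ins for our actual prompt files and pose batch.
OUTFIT_PROMPTS = {
    "armor": "space marine in full armor",
    "armor_helmet": "space marine in full armor with helmet",
    "no_armor": "space marine in casual uniform",
}
POSE_IMAGES = [f"poses/pose_{i:02d}.png" for i in range(20)]  # OpenPose references

def dialog_jobs() -> list[dict]:
    """One generation job per (outfit, pose); each job later gets a
    ReActor face swap against the neutral Unit Portrait."""
    return [
        {"prompt": OUTFIT_PROMPTS[o], "pose": p, "face_ref": "portrait_neutral.png"}
        for o, p in product(OUTFIT_PROMPTS, POSE_IMAGES)
    ]

print(len(dialog_jobs()))  # 3 outfits x 20 poses = 60 generations per click
```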

OpenPose + ReActor in ComfyUI

Unit Miniatures

The Unit Miniatures represented a culmination of these efforts and consumed the bulk of our time. We experimented with various combinations, though none were flawless.

For the neutral miniature, the process was straightforward. For the emotional expressions, however, we explored several methods. IPAdapter, for instance, was used to generate an angry expression and merge it with the neutral one, but the results were suboptimal. We then turned to ReActor, yet this sometimes diluted the intended emotion, much like the outcomes from tools such as Roop or FaceDetailer.

Our solution involved using a specific prompt for each emotion, applied only to the CLIP-L input of the positive SDXL conditioning, while keeping the "neutral emotion" image as the latent image. This approach aimed to preserve as much of the original image as possible. While this method could suffice for drawn images, it fell short for photorealistic ones. Fortunately, given that these miniatures are for a mobile game and partly obscured by the user interface, they adequately serve their purpose.
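In ComfyUI's API-format workflow JSON, this trick corresponds to giving the `CLIPTextEncodeSDXL` node two different texts: the base description on `text_g` (the OpenCLIP-G encoder) and the emotion-only prompt on `text_l` (the CLIP-L encoder). A hedged fragment, in which the node IDs, resolutions, and prompt texts are illustrative:

```json
{
  "12": {
    "class_type": "CLIPTextEncodeSDXL",
    "inputs": {
      "width": 1024, "height": 1024,
      "crop_w": 0, "crop_h": 0,
      "target_width": 1024, "target_height": 1024,
      "text_g": "photo of the character, neutral base description",
      "text_l": "angry expression, furrowed brow, gritted teeth",
      "clip": ["4", 1]
    }
  }
}
```

The neutral miniature is fed in as the starting latent (img2img style), so most of the composition survives and only the CLIP-L conditioning pulls the face toward the emotion.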

The 8 emotions are used during battle

