In September 2023, we restarted our "Space Cruiser" project and took the opportunity to switch to Stable Diffusion SDXL. Our goal was to create photorealistic characters in a Space Opera style. We created artistic concepts using Automatic 1111 to define our artistic style. Here is a summary of our findings.
Currently, SDXL represents a significant advancement in creating photorealistic assets, even though version 1.5 is not far behind.
Rendering times with SDXL are significantly longer. This is crucial to factor in both when rendering and when using extensions.
It is imperative to think about "consistent character" processes from the outset.
SD 1.5 to SDXL
A few months ago, we conducted a series of tests on Space Opera characters using Automatic 1111. The initial results were promising, but the artistic style still leaned a bit too much toward a cartoonish look for our taste.
With the arrival of SDXL in Stable Diffusion, we decided to revisit our tests. In summary, SDXL is a heavier and more comprehensive model that now includes a "refiner" for adding details after image creation, resulting in higher-quality images. However, because the model is larger, rendering times are longer.
We already had our prompting recipe; we just needed to find the right checkpoint. After several attempts, we settled on "Juggernaut XL", which provided the most convincing results. Along the way, we also discovered the LoRA "Juggernaut Cinematic XL" (tied to that checkpoint), which enabled us to produce higher-quality images.
Test with DreamShaper in Stable Diffusion 1.5 - Test with Juggernaut XL - Test with Juggernaut Cinematic XL
Challenges and Choices
In our experiments, we tried to get as close to photorealism as possible, and we quickly achieved that goal.
However, the real challenge was not creating photorealistic concept art but maintaining consistency across various environments, poses, clothing, and other aspects.
Concept Art for Space Cruiser
This is why we paid special attention to non-human species such as robots and aliens: integrating them into ComfyUI and making them "consistent" may pose challenges. Making them "consistent" means keeping the same face and body shape while changing clothes, poses, and environments. A detailed discussion of "consistent characters" will come later.
Examples of humanoid aliens
We conducted numerous experiments with costumes, drawing inspiration from existing pop culture. An interesting twist in our approach was the use of material prompts. Instead of using keywords like "suit" or "armor," we used words like "iron," "linen," "nylon," or "cotton," resulting in more intriguing costume variations.
Example of futuristic clothing using the same character prompt and ControlNet: OpenPose
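The material-prompt trick above can be sketched as a tiny script. This is illustrative only: the base prompt and names below are our own placeholders, not the project's actual prompt recipe. The idea is simply to hold the character prompt fixed and substitute a material word where a garment keyword would normally go.

```python
# Illustrative sketch of material-based prompting: swap a material word
# into an otherwise fixed character prompt instead of using garment
# keywords like "suit" or "armor". Prompt text here is hypothetical.

BASE_PROMPT = ("photorealistic portrait of a space opera officer, "
               "wearing a {material} uniform, cinematic lighting")

MATERIALS = ["iron", "linen", "nylon", "cotton"]

def material_prompts(base: str, materials: list[str]) -> list[str]:
    """Return one prompt per material, ready to batch through the UI or an API."""
    return [base.format(material=m) for m in materials]

for prompt in material_prompts(BASE_PROMPT, MATERIALS):
    print(prompt)
```

Each generated prompt keeps the character description identical, so only the costume interpretation varies between renders.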
2D to 3D
To take our project further, we decided to create a Unity material that would give a sense of depth. The idea is to use the Z-depth of the original image, apply it to a material, and link it to the device's gyroscope. This way, the character moves based on your viewing angle, creating a sense of depth and 3D.
The most critical factor is creating the best possible Z-depth map. The best one we have generated automatically so far came from a GitHub project called "Zoe Depth", which proved more effective than its counterpart "MiDaS", integrated into Automatic 1111. However, we are still exploring other options to improve Z-depth quality, because it is what enables the depth effect: the higher the quality, the wider the camera rotation we can achieve.
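A minimal sketch of the underlying parallax math, assuming a normalized depth map and a small-tilt model (our own illustrative formulation, not the actual Unity shader): each pixel is shifted by an amount proportional to its depth and to the tangent of the device tilt, so near pixels move more than far ones.

```python
import math

def parallax_offset(depth: float, tilt_deg: float, strength: float = 0.05) -> float:
    """UV offset for one pixel.

    depth    -- normalized Z-depth sample (0 = far, 1 = near); hypothetical convention
    tilt_deg -- device tilt angle from the gyroscope, in degrees
    strength -- global scale factor tuned by eye
    """
    return depth * math.tan(math.radians(tilt_deg)) * strength

# Near geometry shifts more than far geometry for the same tilt,
# which is what produces the impression of depth.
near = parallax_offset(depth=1.0, tilt_deg=10)
far = parallax_offset(depth=0.2, tilt_deg=10)
```

In a real shader this offset would be applied per-pixel to the texture UVs, which is also why Z-depth quality caps how far the "camera" can rotate before artifacts show.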
Once this feature is integrated into our engine, it can be used in the character "Description" menu, the "Gatcha" menu, and the "Battle" menu.
Example created on LeiaPix