
GenAI for Sci-fi Dialogs

In July 2023, our development team embarked on a specific mission: to generate HD characters for an upcoming sci-fi game. We aimed to measure the distinctiveness of these characters, assess their integration with game dialogues, and evaluate the latest text-to-speech and lip-sync tools available.

Key Takeaways:

  • Creating a mockup from scratch was impressively quick. It took just 2-3 days for asset creation and implementation, including the research.

  • Adding voices and lip-sync to dialogue significantly enhances character development.

  • Voices, texts, and visual assets can be directly added to the game engine. However, animations and lip-sync require a different approach.

From Concept to Animated Mockup

Aesthetic Design

We pursued an action-centric style for our game, suitable for humans, aliens, and robots. After surveying various models and art styles, we settled on a neo-medieval appearance with gleaming armor. Using wildcards, we generated hundreds of images in batches across 3-4 models of varying photorealism. Post-generation, we applied a 2x latent upscale with a denoising strength of 0.7 to produce integration-ready assets.
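Assuming generation ran through the Automatic1111 SD webui, the batch settings above can be sketched as a txt2img payload. The field names follow the webui's /sdapi/v1/txt2img API, but the prompt text is illustrative and the fields should be checked against your webui version:

```python
# Minimal sketch of a batch-generation payload for the Automatic1111
# SD webui /sdapi/v1/txt2img endpoint. Prompt text is illustrative.
payload = {
    "prompt": "neo-medieval soldier, gleaming armor, sci-fi, photorealistic",
    "negative_prompt": "blurry, low quality",
    "width": 512,
    "height": 712,
    "batch_size": 4,
    # Hires fix matching the settings above: latent upscale x2, denoising 0.7.
    "enable_hr": True,
    "hr_upscaler": "Latent",
    "hr_scale": 2,
    "denoising_strength": 0.7,
}

# With the webui running locally, the batch could be requested with e.g.:
# import requests
# images = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img",
#                        json=payload).json()["images"]
```

With `hr_scale` set to 2, the 512x712 base resolution comes out at the 1024x1424 target mentioned below.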

We experimented with varying levels of realism and art-style references to control the amount of detail on the armor.

Resizing from 512x712 to 1024x1424 is essential to obtain sufficiently large assets.

Character Posing

For our engine integration, characters should interact with each other and speak toward the player, reminiscent of a theater performance. To maintain consistent posing, we applied the ControlNet depth_midas model during the first few frames of the generation only, preserving flexibility in the final shape.
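A hedged sketch of how such a ControlNet unit might be attached to a generation request, assuming the sd-webui-controlnet extension's alwayson_scripts payload; the input image is a placeholder, and limiting guidance_end is one way to express "only the initial part of the generation":

```python
# Sketch of a txt2img payload with a ControlNet depth unit, assuming the
# sd-webui-controlnet extension API; exact argument names may vary by version.
payload = {
    "prompt": "alien officer addressing the player, theater-stage framing",
    "alwayson_scripts": {
        "controlnet": {
            "args": [{
                "input_image": "<base64-encoded pose reference>",  # placeholder
                "module": "depth_midas",  # the depth preprocessor used for posing
                "weight": 1.0,
                # Ending guidance early constrains only the initial part of the
                # generation, keeping flexibility in the final shape.
                "guidance_end": 0.3,
            }]
        }
    },
}
```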

Finding Character Variables

Our approach used a set of keywords specifying body types, alien animal species, robot designs, skin colors, outfit colors, and more. These keywords are cataloged in wildcard files and accessed via the dynamic prompts feature of the SD webui. We then produced random batches from these variables, yielding a wealth of characters to choose from. The team then reviewed the results to determine which visual characteristics were essential for a diverse character range.

Although around 70% of each batch is usable, careful selection remains important.
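The wildcard mechanism itself is simple to illustrate. The sketch below mimics locally what the dynamic prompts feature does with `__name__` tokens; the category names and options are invented for illustration, not our actual wildcard files:

```python
import random

# Illustrative wildcard lists; in the SD webui these would live in text
# files under the wildcards directory, one option per line.
WILDCARDS = {
    "body_type": ["slender", "stocky", "towering"],
    "species": ["human", "insectoid alien", "reptilian alien", "robot"],
    "armor_color": ["gunmetal", "crimson", "ivory"],
}

def expand(template: str, rng: random.Random) -> str:
    """Replace each __name__ token with a random option from its list,
    the way the dynamic prompts wildcard syntax does."""
    out = template
    for name, options in WILDCARDS.items():
        token = f"__{name}__"
        while token in out:
            out = out.replace(token, rng.choice(options), 1)
    return out

template = "__body_type__ __species__ in __armor_color__ neo-medieval armor"
prompts = [expand(template, random.Random(seed)) for seed in range(4)]
```

Feeding such templates into a batch produces the random character variations the team then reviews.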

Creating Remaining Assets for Mockup

We quickly generated a background image depicting space, combining perspective depth and some hangar footage from movies. Our past experience with background generation confirmed our ability to craft various backdrops with ease.

Generating Dialogs

For the dialogues, our goal was to achieve a captivating Hollywood tone, instilling a sense of urgency reminiscent of a battle scenario. For the final game, we employ a narrative design document detailing each dialogue, considering style, character evolution, universe expansion, and storyline, all powered by the OpenAI GPT API.

Captain Markus: "Chief O'Riley, the Zephrons won't love our little shortcut, but it's our only shot."

Chief O'Riley: "Great, just what I need. Diplomatic backlash to go with my morning coffee."

Captain Markus: "Pack some donuts too. We're going, Chief."

Chief O'Riley: "All right, Captain Space Bulldozer. If we start an intergalactic war, I'm blaming you."

Captain Markus: "Deal. You can tell 'em at my court-martial."

Chief O'Riley: "Good, I've been working on my opening statement."
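An exchange like the one above can be requested by assembling a chat request from the narrative design document. The brief fields, system prompt wording, and model name below are assumptions for illustration; the actual API call is left commented out:

```python
# Sketch: building a chat request for the OpenAI GPT API from a
# narrative-design brief. Brief fields and wording are illustrative.
brief = {
    "style": "Hollywood banter with a sense of urgency, battle scenario",
    "characters": ["Captain Markus", "Chief O'Riley"],
    "beat": "a risky shortcut through Zephron territory",
}

messages = [
    {"role": "system",
     "content": f"You write game dialogue. Style: {brief['style']}. "
                "Keep lines short and speakable."},
    {"role": "user",
     "content": f"Write a 6-line exchange between "
                f"{' and '.join(brief['characters'])} about: {brief['beat']}."},
]

# With the openai client installed and an API key configured:
# from openai import OpenAI
# reply = OpenAI().chat.completions.create(model="gpt-4", messages=messages)
```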

Animation Mockup

We used ElevenLabs to generate the dialogue audio, then fed it to D-ID to animate the characters' faces. A quick integration in After Effects gave us a proof of concept for the dialogues we would like to have in the game.
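For reference, a voice line can be requested from the ElevenLabs text-to-speech API roughly as follows. The voice ID, API key, and voice_settings values are placeholders, and the endpoint and fields should be checked against the current ElevenLabs documentation:

```python
# Sketch of a text-to-speech request to the ElevenLabs API.
# VOICE_ID and the API key are placeholders.
VOICE_ID = "<your-voice-id>"
URL = f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}"

headers = {"xi-api-key": "<api-key>", "Content-Type": "application/json"}
body = {
    "text": "Chief O'Riley, the Zephrons won't love our little shortcut, "
            "but it's our only shot.",
    "voice_settings": {"stability": 0.5, "similarity_boost": 0.75},
}

# With a valid key, the audio could be fetched and saved with e.g.:
# import requests
# audio = requests.post(URL, headers=headers, json=body).content
# open("markus_line_01.mp3", "wb").write(audio)
```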

This method cannot be shipped in the game as-is, but we could generate voice-overs and use other types of lip-sync methods in Unity to bring more life to our characters during dialogues. At the very least, it can be used for marketing material.

