US20240304174
2024-09-12
Physics
G10L13/027
Methods and systems are described for enhancing video game experiences by capturing ambient audio from a player's environment. The approach identifies background voices that are distinct from the player's voice, isolates phonemes within those voices, and uses generative artificial intelligence (AI) to synthesize speech for in-game characters. The synthesized speech can be tailored by attributes such as age and gender, which supports a more immersive experience.
The system receives audio input through a microphone associated with a gaming device. It distinguishes the user's voice from background voices by analyzing sound characteristics such as frequency patterns and cadence. Once a background voice is identified, the system isolates its phonemes and checks that enough phoneme data exists to synthesize coherent speech. The synthesized speech can then be integrated into gameplay, giving character interactions dynamic dialogue options.
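The "enough data" check described above can be illustrated with a minimal sketch. It assumes the phonemes have already been isolated and labeled by some upstream recognizer. The required inventory, the minimum count, and the threshold are hypothetical values chosen for illustration, not figures from the source.

```python
from collections import Counter

# Hypothetical target inventory: a small subset of ARPAbet-style symbols,
# used only for illustration. A real system would need a fuller set.
REQUIRED_PHONEMES = frozenset({"AA", "IY", "UW", "S", "T", "N", "M", "K"})


def phoneme_coverage(observed, required=REQUIRED_PHONEMES, min_count=2):
    """Return (coverage fraction, missing phonemes) for one background voice.

    A required phoneme counts as covered once it has been heard at least
    `min_count` times (an assumed rule, not one stated in the source).
    """
    counts = Counter(observed)
    covered = {p for p in required if counts[p] >= min_count}
    return len(covered) / len(required), set(required) - covered


def has_enough_data(observed, threshold=0.75):
    """Assumed decision rule: synthesize only above a coverage threshold."""
    coverage, _missing = phoneme_coverage(observed)
    return coverage >= threshold


if __name__ == "__main__":
    heard = ["AA", "AA", "IY", "IY", "S", "S", "T", "T", "N", "N", "M", "M"]
    print(phoneme_coverage(heard))
    print(has_enough_data(heard))
```

Returning the missing phonemes alongside the coverage fraction lets a caller decide whether to keep sampling the environment before attempting synthesis.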
Players can further customize their experience by selecting different background voices for characters. The system can generate speech from static scripts as well as from real-time gameplay changes. Players can also modify the synthesized speech through filtering options that adjust attributes such as pitch or gender, so that a character's voice fits the desired persona more closely.
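As one illustration of a pitch filter, the sketch below shifts a voice's fundamental frequency by a number of semitones using the standard equal-temperament ratio of 2^(n/12). The `VoiceParams` type and its fields are hypothetical; the source does not specify how voice attributes are represented or how its filters work.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VoiceParams:
    """Hypothetical container for adjustable voice attributes."""
    f0_hz: float                # fundamental frequency (perceived pitch)
    speaking_rate: float = 1.0  # relative speed, left unchanged here


def shift_pitch(params: VoiceParams, semitones: float) -> VoiceParams:
    """Return a copy with pitch moved by `semitones` (12 semitones = 1 octave)."""
    return replace(params, f0_hz=params.f0_hz * 2 ** (semitones / 12))


if __name__ == "__main__":
    base = VoiceParams(f0_hz=220.0)
    print(shift_pitch(base, 12))   # one octave up
    print(shift_pitch(base, -12))  # one octave down
```

Keeping the parameters immutable and returning a modified copy means the saved voice stays intact while a player experiments with different filter settings.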
The method derives a unique "voice fingerprint" for each identified background voice. The fingerprint captures features such as timbre and sharpness of speech, which matter for accurate reproduction. The system samples audio continuously during gameplay or at specific intervals, both to build a fuller picture of the environment's soundscape and to refine the voice fingerprint as needed.
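The refinement of a fingerprint over repeated samples can be sketched as a running mean over feature vectors, with cosine similarity used to compare a new sample against a stored fingerprint. This is an assumed representation for illustration only: the source does not say how the fingerprint is encoded, and the feature values here are placeholders for measurements such as timbre or sharpness.

```python
import math
from dataclasses import dataclass, field


@dataclass
class VoiceFingerprint:
    """Hypothetical fingerprint: running mean of per-sample feature vectors."""
    mean: list = field(default_factory=list)
    samples: int = 0

    def update(self, features):
        """Fold one new feature vector into the running mean."""
        if self.samples == 0:
            self.mean = [float(x) for x in features]
        else:
            n = self.samples + 1
            self.mean = [m + (x - m) / n for m, x in zip(self.mean, features)]
        self.samples += 1


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    fp = VoiceFingerprint()
    fp.update([1.0, 3.0])
    fp.update([3.0, 5.0])
    print(fp.mean, fp.samples)
    print(cosine_similarity(fp.mean, [2.0, 4.0]))
```

An incremental mean avoids storing every past sample, which suits sampling that runs continuously or at intervals during gameplay.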
The technology aims to create a more engaging gaming environment by integrating real-world sounds into virtual interactions. By saving voice fingerprints and their parameters, the system can recall distinct background voices for later use and integrate them into ongoing gameplay. This approach enriches player immersion and opens avenues for further innovation in audio-visual gaming experiences.
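Saving and recalling fingerprints could look like the following minimal sketch, which serializes a store of fingerprints and their parameters to JSON. The store layout, the voice identifiers, and the function names are all hypothetical. The source states only that fingerprints and parameters are saved for future use, not how.

```python
import json


def serialize_fingerprints(store):
    """Encode {voice_id: {"features": [...], "params": {...}}} as JSON text."""
    return json.dumps(store, sort_keys=True)


def deserialize_fingerprints(text):
    """Decode JSON text produced by serialize_fingerprints."""
    return json.loads(text)


def recall(store, voice_id):
    """Look up a saved voice; returns None if it was never stored."""
    return store.get(voice_id)


if __name__ == "__main__":
    store = {
        "background_voice_1": {
            "features": [0.42, 0.17, 0.88],      # placeholder values
            "params": {"pitch_semitones": -2},   # placeholder filter setting
        }
    }
    saved = serialize_fingerprints(store)
    restored = deserialize_fingerprints(saved)
    print(recall(restored, "background_voice_1"))
```

A text format such as JSON keeps the saved voices easy to inspect, at the cost of being less compact than a binary encoding.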