Invention Title:

METHODS AND SYSTEMS FOR RENDERING OBJECT BASED AUDIO

Publication number:

US20250356861

Publication date:

2025-11-20

Section:

Physics

Class:

G10L19/008

Inventors:

Sripal S. Mehta 🇺🇸 San Francisco, CA, United States

Prinyar Saungsomboon 🇬🇧 Reading, United Kingdom

Thomas ZIEGLER 🇩🇪 Nuremberg, Germany

Giles BAKER 🇺🇸 San Francisco, CA, United States

Jeffrey RIEDMILLER 🇺🇸 Novato, CA, United States

Assignees:

DOLBY LABORATORIES LICENSING CORPORATION 🇺🇸 SAN FRANCISCO, CA, United States

DOLBY INTERNATIONAL AB 🇮🇪 DUBLIN, Ireland

Applicants:

DOLBY LABORATORIES LICENSING CORPORATION 🇺🇸 San Francisco, CA, United States

DOLBY INTERNATIONAL AB 🇮🇪 Dublin, Ireland

Smart overview of the Invention

The patent application describes methods and systems for generating and rendering object-based audio programs that are customizable and provide an immersive audio experience. The audio program includes a bed of speaker channels, which can be rendered to provide a default full-range audio experience even when the user selects no additional content. The system also provides user-selectable and configurable object channels, which allow the rendered audio to be personalized.

Technical Field

The invention pertains to audio signal processing, specifically the encoding, decoding, and interactive rendering of audio data bitstreams. These bitstreams include both audio content and metadata that supports interactive rendering. The invention is compatible with formats like Dolby Digital (AC-3), Dolby Digital Plus (E-AC-3), and Dolby E, although it is not limited to these formats.

Background

Dolby Laboratories provides proprietary implementations of AC-3 and E-AC-3, known as Dolby Digital and Dolby Digital Plus, respectively. A typical audio data stream includes both compressed audio content and metadata indicating characteristics of that content. Some metadata parameters are intended to change how the program sounds in the listening environment; DIALNORM, for example, indicates the mean dialog level of a program and is used to determine the playback signal level. A bitstream can carry multiple channels of audio content compressed using perceptual audio coding.
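
As an illustration of how a parameter such as DIALNORM can alter playback level, the following is a minimal sketch, not taken from the application. It assumes the conventional AC-3 interpretation in which the 5-bit code 1 to 31 denotes a mean dialog level of -1 to -31 dBFS, the reserved code 0 is treated as 31, and the decoder attenuates the program so that dialog lands at -31 dBFS. The function names are hypothetical.

```python
REFERENCE_DIALOG_LEVEL = 31  # assumed decoder target: dialog at -31 dBFS


def dialnorm_gain_db(dialnorm_code: int) -> int:
    """Return the gain in dB a decoder would apply for a 5-bit DIALNORM code."""
    if not 0 <= dialnorm_code <= 31:
        raise ValueError("DIALNORM is a 5-bit code in the range 0..31")
    # Assumption: the reserved code 0 is interpreted as -31 dBFS.
    level = dialnorm_code if dialnorm_code != 0 else REFERENCE_DIALOG_LEVEL
    # Louder dialog (smaller code) receives more attenuation.
    return -(REFERENCE_DIALOG_LEVEL - level)


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude factor."""
    return 10 ** (gain_db / 20)


gain_db = dialnorm_gain_db(27)        # dialog measured at -27 dBFS
scale = db_to_linear(gain_db)         # factor applied to every decoded sample
```

Under these assumptions, a program whose dialog averages -27 dBFS is attenuated by 4 dB, while a program already at -31 dBFS passes through unchanged.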

Audio Bitstream Structure

Each frame of an AC-3 or E-AC-3 encoded bitstream is divided into sections: synchronization information (SI); bitstream information (BSI), which contains most of the metadata; audio blocks containing the compressed audio data; waste bits, which are any unused bits left over after the audio content is compressed; an auxiliary (AUX) section for additional metadata; and error correction (CRC) words. Some of the metadata parameters allow the sound to be customized for the listening environment.
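
The section layout described above can be sketched in code. The example below is illustrative only and covers plain AC-3, not E-AC-3, whose synchronization section is laid out differently. It assumes the SI layout published in the ATSC A/52 standard (a 16-bit syncword 0x0B77, a 16-bit CRC word, a 2-bit sample rate code, and a 6-bit frame size code); the class and function names are hypothetical.

```python
from dataclasses import dataclass

# Frame sections in the order listed in the text above.
FRAME_SECTIONS = ("SI", "BSI", "audio blocks", "waste bits", "AUX", "CRC")

AC3_SYNCWORD = 0x0B77
# Sample rate code -> sample rate in Hz; code 0b11 is reserved.
SAMPLE_RATES_HZ = {0b00: 48000, 0b01: 44100, 0b10: 32000}


@dataclass
class SyncInfo:
    crc1: int        # first of the frame's two CRC words
    fscod: int       # 2-bit sample rate code
    frmsizecod: int  # 6-bit frame size code

    @property
    def sample_rate_hz(self):
        # Returns None for the reserved code.
        return SAMPLE_RATES_HZ.get(self.fscod)


def parse_sync_info(frame: bytes) -> SyncInfo:
    """Parse the 5-byte SI section at the start of an AC-3 frame."""
    if len(frame) < 5:
        raise ValueError("need at least 5 bytes for the SI section")
    if int.from_bytes(frame[0:2], "big") != AC3_SYNCWORD:
        raise ValueError("frame does not start with the AC-3 syncword")
    crc1 = int.from_bytes(frame[2:4], "big")
    fscod = frame[4] >> 6          # top two bits of the fifth byte
    frmsizecod = frame[4] & 0x3F   # low six bits of the fifth byte
    return SyncInfo(crc1, fscod, frmsizecod)


si = parse_sync_info(bytes([0x0B, 0x77, 0x12, 0x34, 0x1E]))
```

A full parser would continue into the BSI section, where most of the metadata resides; that step is omitted here because its fields vary with the bitstream configuration.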

User Interactivity

The system enables a high level of user interactivity by allowing users to select a mix of audio content for rendering. Users can choose among various rendering options provided by metadata, such as selecting specific object channels or adjusting playback levels of certain sound sources. Metadata also allows users to choose spatial locations for sound source rendering or select from a menu of predefined rendering options.
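
The interaction described above can be sketched as a small mixer. This is a hedged illustration under simplifying assumptions, not the renderer described in the application: the bed is reduced to two speaker channels ("L" and "R"), each object channel is mono, spatial location is reduced to a single pan value, and a constant-power pan law is swapped in for whatever rendering the actual system performs. All names, including `ObjectChannel`, `render`, and the `selection` dictionary, are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class ObjectChannel:
    name: str
    samples: list          # mono samples for this object
    pan: float = 0.0       # default location: -1.0 = left, +1.0 = right


def db_to_linear(gain_db: float) -> float:
    return 10 ** (gain_db / 20)


def render(bed: dict, objects: list, selection: dict = None) -> dict:
    """Mix user-selected objects into a copy of the stereo bed.

    `selection` maps an object name to optional overrides, e.g.
    {"home_commentary": {"gain_db": -3.0, "pan": -0.5}}.
    With no selection, the bed alone is returned as the default mix.
    """
    out = {ch: list(samples) for ch, samples in bed.items()}
    for obj in objects:
        if not selection or obj.name not in selection:
            continue  # unselected objects are not rendered
        choice = selection[obj.name]
        gain = db_to_linear(choice.get("gain_db", 0.0))
        pan = choice.get("pan", obj.pan)
        theta = (pan + 1.0) * math.pi / 4      # constant-power pan law
        gain_l, gain_r = math.cos(theta), math.sin(theta)
        for i in range(min(len(obj.samples), len(out["L"]))):
            out["L"][i] += gain * gain_l * obj.samples[i]
            out["R"][i] += gain * gain_r * obj.samples[i]
    return out


bed = {"L": [0.1, 0.1], "R": [0.1, 0.1]}
objects = [ObjectChannel("home_commentary", [1.0, 1.0]),
           ObjectChannel("away_commentary", [0.5, 0.5])]

default_mix = render(bed, objects)
custom_mix = render(bed, objects, {"home_commentary": {"pan": -1.0}})
```

Calling `render` with no selection returns the bed unchanged, mirroring the default experience the bed provides on its own; passing a selection adds only the chosen objects at the requested level and position.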