Invention Title:

END-TO-END INSTANCE-SEPARABLE SEMANTIC-IMAGE JOINT CODEC SYSTEM AND METHOD

Publication number:

US20240171761

Publication date:

Section:

Electricity

Class:

H04N19/42

Inventors:

Assignee:

Applicant:

Smart overview of the Invention

An end-to-end instance-separable semantic-image joint compression system is designed to improve image compression efficiency by exploiting the redundancy between images and their associated semantics. The system includes image and mask encoders, a union encoder, an embedding extraction module, and entropy coding modules. Because each instance is coded independently, the system improves rate-distortion performance and allows the semantics and image of a specific object to be retrieved efficiently during decoding.

Technical Innovations

The proposed method addresses a limitation of existing technologies, which compress images and their semantics separately and therefore often leave significant redundancy between the two. By jointly compressing images with their semantics (such as instance masks, bounding boxes, and relationships), the system reduces computational costs and improves overall compression ratios. The ability to code instances separately is particularly beneficial for applications such as real-time special effects in movies or virtual group photos, where only specific parts of an image are needed.

System Components

The joint codec system consists of several key components:

  • Image Encoder and Decoder: Process instance images to generate image representations.
  • Mask Encoder and Decoder: Handle instance masks to create mask representations.
  • Union Encoder: Combines image and mask representations into union representations.
  • Embedding Extraction Module: Generates union embeddings from union representations.
  • Semantic Graph Encoder and Decoder: Compress and recover the semantic graph.
  • Entropy Coding Modules: Encode the various representations into bitstreams based on probability distributions.
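
To illustrate why the entropy coding modules depend on probability distributions, the sketch below computes the ideal code length of a sequence of quantized symbols, which is the sum of -log2 p(s) over the symbols and is the bit count an ideal entropy coder approaches. The symbol values, the distribution, and the function name ideal_code_length_bits are hypothetical and chosen only for illustration; they are not taken from the patent.

```python
import math


def ideal_code_length_bits(symbols, pmf):
    """Return the sum of -log2 p(s): the cost an ideal entropy coder approaches."""
    return sum(-math.log2(pmf[s]) for s in symbols)


# Hypothetical quantized symbols of one representation.
symbols = [0, 0, 1, 0, -1, 0, 0, 1]

# Hypothetical distribution from an entropy model that matches the data well.
estimated_pmf = {0: 0.5, 1: 0.25, -1: 0.25}

# A uniform distribution over the same alphabet, for comparison.
uniform_pmf = {0: 1 / 3, 1: 1 / 3, -1: 1 / 3}

bits_estimated = ideal_code_length_bits(symbols, estimated_pmf)
bits_uniform = ideal_code_length_bits(symbols, uniform_pmf)

print(f"estimated model: {bits_estimated:.2f} bits")
print(f"uniform model:   {bits_uniform:.2f} bits")
```

The better the estimated distribution matches the symbols, the fewer bits are needed, which is why a sharper entropy model translates directly into a lower bitrate.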

Encoding Process

The encoding process starts with the extraction of semantics from the input images. The image encoder and the mask encoder produce the image and mask representations, which the union encoder then combines into a union representation. The union representation is quantized into a union embedding, which is embedded into a semantic graph. Each component's bitstream is then generated by entropy coding, using probability distributions estimated by entropy models.
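
The data flow of the encoding steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented networks: every function here (image_encoder, mask_encoder, union_encoder, extract_embedding, encode_instance) is a hypothetical stand-in that uses trivial arithmetic in place of learned transforms, and the final entropy coding step is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticGraph:
    """Hypothetical container: instance id -> attributes, plus relationships."""
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)


def image_encoder(instance_image):
    # Stand-in for a learned encoder: scale 8-bit pixels to [0, 1].
    return [p / 255.0 for p in instance_image]


def mask_encoder(instance_mask):
    # Stand-in for a learned encoder: keep the binary mask as floats.
    return [float(m) for m in instance_mask]


def union_encoder(image_rep, mask_rep):
    # Stand-in: combine both representations element-wise.
    return [i * m for i, m in zip(image_rep, mask_rep)]


def extract_embedding(union_rep, step=0.25):
    # Stand-in: uniform quantization of the union representation to integers.
    return [round(u / step) for u in union_rep]


def encode_instance(graph, instance_id, instance_image, instance_mask):
    """Run one instance through the stand-in pipeline and attach its embedding."""
    image_rep = image_encoder(instance_image)
    mask_rep = mask_encoder(instance_mask)
    union_rep = union_encoder(image_rep, mask_rep)
    union_emb = extract_embedding(union_rep)
    # Embed the quantized union embedding into the instance's graph node.
    graph.nodes.setdefault(instance_id, {})["union_embedding"] = union_emb
    return image_rep, mask_rep, union_emb


graph = SemanticGraph()
# Hypothetical 4-pixel instance image and its binary mask.
encode_instance(graph, "person_0", [0, 128, 255, 64], [0, 1, 1, 0])
print(graph.nodes["person_0"]["union_embedding"])
```

In a real codec each representation and the graph would then be passed to an entropy coder driven by its own estimated distribution.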

Decoding Methodology

The decoding process begins with the recovery of the semantic graph from its bitstream. The union embedding is then decoded to retrieve the quantized union representation, from which the mask representation is recovered. Finally, the image representation is reconstructed using all previously decoded information. Because instances are coded independently, a specific instance can be accessed without decoding all of the information, which saves computational resources and reduces transmission costs.
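
The decoding order described above, and the benefit of instance-separable bitstreams, can be sketched as follows. This is an illustration under assumptions: the container layout, the names pack, unpack, and decode_instance, and the use of JSON plus zlib in place of a learned entropy coder are all hypothetical. The stand-in also does not model how the mask and image decoders would condition on previously decoded information.

```python
import json
import zlib


def pack(obj):
    # Stand-in for entropy encoding: serialize to JSON and compress with zlib.
    return zlib.compress(json.dumps(obj).encode("utf-8"))


def unpack(bitstream):
    # Stand-in for entropy decoding.
    return json.loads(zlib.decompress(bitstream).decode("utf-8"))


# Hypothetical container: one graph bitstream plus separate per-instance bitstreams.
container = {
    "graph": pack({"nodes": ["person_0", "dog_1"],
                   "edges": [["person_0", "holds", "dog_1"]]}),
    "instances": {
        "person_0": {"union": pack([0, 2, 4, 0]),
                     "mask": pack([0, 1, 1, 0]),
                     "image": pack([0, 128, 255, 0])},
        "dog_1": {"union": pack([4, 0, 0, 1]),
                  "mask": pack([1, 0, 0, 1]),
                  "image": pack([255, 0, 0, 64])},
    },
}

decoded_streams = []  # records which bitstreams were actually decoded


def decode_instance(container, instance_id):
    """Decode the graph, then the union, mask, and image streams of one instance."""
    graph = unpack(container["graph"])
    decoded_streams.append("graph")
    if instance_id not in graph["nodes"]:
        raise KeyError(instance_id)
    streams = container["instances"][instance_id]
    result = {}
    for name in ("union", "mask", "image"):  # order follows the text above
        result[name] = unpack(streams[name])
        decoded_streams.append(f"{instance_id}/{name}")
    return graph, result


graph, person = decode_instance(container, "person_0")
print(decoded_streams)
```

Only the graph and the three streams of the requested instance are decoded; the streams belonging to dog_1 are never touched.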