Invention Title:

END-TO-END INSTANCE-SEPARABLE SEMANTIC-IMAGE JOINT CODEC SYSTEM AND METHOD

Publication number:

US20240171761

Publication date:

2024-05-23

Section:

Electricity

Class:

H04N19/42

Inventors:

WEIYAO LIN Shanghai, China

Shizhan LIU Shanghai, China

Assignee:

Shanghai Jiao Tong University Shanghai, China

Applicant:

SHANGHAI JIAO TONG UNIVERSITY Shanghai, China

Drawings (4 of 26)

Drawings 01 to 04 for END-TO-END INSTANCE-SEPARABLE SEMANTIC-IMAGE JOINT CODEC SYSTEM AND METHOD

Smart overview of the Invention

The patent describes a system that compresses an image jointly with its semantic data: instance masks, bounding boxes, categories, and relationships. It exploits the redundancy between an image and its semantics to improve rate-distortion performance, and it allows specific instances within an image to be encoded separately. As a result, the semantics and image of a specific object can be retrieved on their own, which reduces computation and transmission costs.

Technical Field

The technology pertains to image compression, specifically a joint codec system that can encode and decode specific instances in an image separately. Unlike traditional methods, which process images and semantics independently, this system codes them together to exploit the redundancy between them, improving compression efficiency.

Background Technology

Existing compression techniques often fail to use the redundancy between an image and its semantics, which lowers compression efficiency. They also require all information to be decoded in full before a specific object can be retrieved, incurring unnecessary computation and transmission costs. The proposed method addresses both issues with an instance-separable codec, saving resources in applications such as real-time special effects and virtual group photos.

System Components

The system comprises an image encoder and decoder, a mask encoder and decoder, a union encoder, an embedding extraction module, a semantic graph encoder and decoder, an entropy coding module, and three entropy models: a union entropy model, an image entropy model, and a mask entropy model. Neural networks encode the instance images and masks, downsampling them and reducing their dimensionality. The resulting representations are then combined into a semantic graph for efficient compression.
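To make the role of the entropy models more concrete, the following minimal sketch estimates how many bits a short sequence of quantized symbols would cost under a predicted probability distribution. It assumes a Gaussian discretized to unit-width bins, which is a common choice in learned image compression but is not stated in this summary; the function names and the example values are hypothetical.

```python
import math


def gaussian_cdf(x, mu, sigma):
    """Cumulative distribution function of a Gaussian with mean mu and std sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def symbol_probability(y, mu, sigma):
    """Probability mass of integer symbol y under a Gaussian discretized to unit bins.

    ASSUMPTION: the discretized Gaussian is an illustrative choice, not taken
    from the patent summary.
    """
    return gaussian_cdf(y + 0.5, mu, sigma) - gaussian_cdf(y - 0.5, mu, sigma)


def estimated_bits(symbols, mus, sigmas, eps=1e-9):
    """Sum of -log2 p(y) over all symbols: the ideal entropy-coded length in bits."""
    total = 0.0
    for y, mu, sigma in zip(symbols, mus, sigmas):
        p = max(symbol_probability(y, mu, sigma), eps)  # guard against log(0)
        total += -math.log2(p)
    return total


# Hypothetical quantized latent values for one instance.
latent = [0, 1, -2, 0]

# An entropy model whose predicted means match the symbols well...
bits_informed = estimated_bits(latent, mus=[0, 1, -2, 0], sigmas=[0.5] * 4)
# ...versus one that predicts zero everywhere.
bits_uninformed = estimated_bits(latent, mus=[0, 0, 0, 0], sigmas=[0.5] * 4)

print(f"informed model:   {bits_informed:.2f} bits")
print(f"uninformed model: {bits_uninformed:.2f} bits")
```

The better the predicted distribution matches the actual symbols, the fewer bits the entropy coder needs. This is the sense in which conditioning on related information, such as already available semantics, can improve rate-distortion performance.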

Encoding and Decoding Process

Step 1: Extract semantics from the input image and use them to separate the image into instance images. Use encoders to obtain representations, which are then combined into union representations.
Step 2: Encode the semantic graph and the representations into bitstreams, using the entropy models to estimate the probability distributions.
Decoding: Decode the semantic graph bitstream first. Then use the entropy models to decode the union, mask, and image bitstreams, in that order.
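The steps above can be sketched as a toy pipeline that keeps one set of bitstreams per instance, so a single object can be decoded without touching the others. This is only an illustration of the data flow under stated assumptions: `zlib` and JSON stand in for the learned encoders and entropy coding, the "union" payload is a placeholder (the summary does not say what it contains), and every name here (`SemanticGraph`, `toy_encode`, `decode_instance`, and so on) is hypothetical.

```python
import json
import zlib
from dataclasses import dataclass, field


@dataclass
class SemanticGraph:
    # Nodes: instance id -> {"category": str, "bbox": [x0, y0, x1, y1]}
    nodes: dict = field(default_factory=dict)
    # Edges: [subject id, relation, object id]
    edges: list = field(default_factory=list)


def toy_encode(obj):
    """Stand-in for a learned encoder plus entropy coder (ASSUMPTION: JSON + zlib)."""
    return zlib.compress(json.dumps(obj).encode("utf-8"))


def toy_decode(blob):
    """Inverse of toy_encode."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


def separate_instance(image, mask):
    """Step 1: keep the pixels where the mask is 1 and zero out the rest."""
    return [[px if m else 0 for px, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


def encode(image, masks, graph):
    """Step 2: one semantic graph bitstream plus union/mask/image bitstreams per instance."""
    streams = {
        "graph": toy_encode({"nodes": graph.nodes, "edges": graph.edges}),
        "instances": {},
    }
    for inst_id, mask in masks.items():
        inst_image = separate_instance(image, mask)
        # Placeholder union payload: just the mask area, purely for illustration.
        union = {"mask_area": sum(sum(row) for row in mask)}
        streams["instances"][inst_id] = {
            "union": toy_encode(union),
            "mask": toy_encode(mask),
            "image": toy_encode(inst_image),
        }
    return streams


def decode_instance(streams, inst_id):
    """Decode one instance: semantic graph first, then union, mask, and image."""
    graph = toy_decode(streams["graph"])
    parts = streams["instances"][inst_id]  # other instances are never read
    union = toy_decode(parts["union"])
    mask = toy_decode(parts["mask"])
    inst_image = toy_decode(parts["image"])
    return graph["nodes"][inst_id], union, mask, inst_image


# A 2x2 grayscale "image" containing two hypothetical instances.
image = [[10, 20], [30, 40]]
masks = {"a": [[1, 0], [0, 0]], "b": [[0, 1], [1, 1]]}
graph = SemanticGraph(
    nodes={"a": {"category": "cat", "bbox": [0, 0, 1, 1]},
           "b": {"category": "dog", "bbox": [0, 0, 2, 2]}},
    edges=[["a", "next to", "b"]],
)

streams = encode(image, masks, graph)
node, union, mask, inst_image = decode_instance(streams, "b")
print(node["category"], inst_image)
```

Because each instance has its own bitstreams, `decode_instance` reads only the shared semantic graph and the three streams for the requested object, which mirrors the computation and transmission savings the patent attributes to instance-separable coding.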