US20240171761
2024-05-23
Electricity
H04N19/42
The patent describes a system for compressing images jointly with their semantic data, such as instance masks, bounding boxes, categories, and relationships. It exploits the redundancy between an image and its semantics to improve rate-distortion performance, and it allows specific instances within an image to be encoded separately. This supports efficient retrieval of both the semantics and the image of a specific object, reducing computational and transmission costs.
The technology pertains to image compression, focusing on a joint codec system that enables separate encoding and decoding of specific instances in an image. Unlike traditional methods that process images and semantics independently, this system integrates them to exploit redundancy, improving compression efficiency and performance.
Existing compression techniques often fail to utilize the redundancy between images and their semantics, leading to inefficiencies. They also require all information to be fully decoded before a specific object can be retrieved, resulting in unnecessary computation and transmission costs. The proposed method addresses these issues by supporting instance-separable codecs, which saves resources in applications such as real-time special effects and virtual group photos.
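The idea of an instance-separable bitstream can be illustrated with a minimal sketch. It is not the patent's codec: `zlib` stands in for the learned encoders and entropy coder, and the container layout (a length-prefixed JSON index followed by per-instance segments) and the function names `pack_instances` and `unpack_instance` are assumptions made purely for illustration. The point is only that one instance can be decoded without decoding the others.

```python
import json
import struct
import zlib
from typing import Dict


def pack_instances(instances: Dict[str, bytes]) -> bytes:
    """Compress each instance independently and prepend an offset index.

    Hypothetical container layout: 4-byte header length, JSON index, segments.
    zlib is only a stand-in for a learned codec.
    """
    index = []
    segments = []
    offset = 0
    for name, payload in instances.items():
        segment = zlib.compress(payload)
        index.append({"name": name, "offset": offset, "length": len(segment)})
        segments.append(segment)
        offset += len(segment)
    header = json.dumps(index).encode("utf-8")
    return struct.pack(">I", len(header)) + header + b"".join(segments)


def unpack_instance(stream: bytes, name: str) -> bytes:
    """Decode a single named instance, leaving all other segments untouched."""
    (header_len,) = struct.unpack(">I", stream[:4])
    index = json.loads(stream[4:4 + header_len].decode("utf-8"))
    body_start = 4 + header_len
    for entry in index:
        if entry["name"] == name:
            start = body_start + entry["offset"]
            return zlib.decompress(stream[start:start + entry["length"]])
    raise KeyError(name)
```

In this sketch, a receiver that needs only one object reads the index, seeks to that segment, and decompresses it alone, which is the resource saving the paragraph above attributes to instance-separable coding.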
Key components of the system include an image encoder and decoder, a mask encoder and decoder, a union encoder, an embedding extraction module, a semantic graph encoder and decoder, an entropy coding module, and three entropy models (union, image, and mask). The process encodes instance images and masks using neural networks for downsampling and dimensionality reduction. The encoded instances are then combined into a semantic graph for efficient compression.
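A rough sketch of the data flow described above, under loud assumptions: plain 2x average pooling replaces the neural downsampling networks, and the `InstanceNode` and `SemanticGraph` classes, their fields, and the `downsample2x` helper are hypothetical names chosen for illustration, not structures defined in the patent.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Grid = List[List[float]]


def downsample2x(grid: Grid) -> Grid:
    """Average-pool by 2 in each dimension (stand-in for a learned encoder)."""
    h = len(grid) // 2 * 2
    w = len(grid[0]) // 2 * 2
    return [
        [
            (grid[y][x] + grid[y][x + 1] + grid[y + 1][x] + grid[y + 1][x + 1]) / 4.0
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


@dataclass
class InstanceNode:
    """One object in the image: its semantics plus reduced image and mask."""
    category: str
    bbox: Tuple[int, int, int, int]  # x, y, width, height
    image_latent: Grid
    mask_latent: Grid


@dataclass
class SemanticGraph:
    """Nodes are instances; edges are (subject, relation, object) triples."""
    nodes: List[InstanceNode] = field(default_factory=list)
    edges: List[Tuple[int, str, int]] = field(default_factory=list)

    def add_instance(self, category: str, bbox: Tuple[int, int, int, int],
                     image: Grid, mask: Grid) -> int:
        node = InstanceNode(category, bbox, downsample2x(image), downsample2x(mask))
        self.nodes.append(node)
        return len(self.nodes) - 1  # node id used when adding relationships

    def relate(self, subject: int, relation: str, obj: int) -> None:
        self.edges.append((subject, relation, obj))
```

A short usage example: two instances are reduced and linked by a relationship.

```python
graph = SemanticGraph()
image = [[1.0, 3.0, 5.0, 7.0], [1.0, 3.0, 5.0, 7.0]]
mask = [[1.0, 1.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]]
person = graph.add_instance("person", (0, 0, 4, 2), image, mask)
dog = graph.add_instance("dog", (4, 0, 4, 2), image, mask)
graph.relate(person, "next to", dog)
```

In the described system the graph would then go through the semantic graph encoder and the entropy coding module; neither stage is modelled here.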