Invention Title:

IMAGE COMPRESSION AUGMENTED WITH A LEARNING-BASED SUPER RESOLUTION MODEL

Publication number:

US20240144425

Publication date:

2024-05-02

Section:

Physics

Class:

G06T3/4053

Inventors:

Jinjun Xiong Goldens Bridge, NY, United States

Vikram SHARMA MAILTHODY URBANA, IL, United States

Nicholas CHEN Chicago, IL, United States

James WEI San Francisco, CA, United States

Applicants:

INTERNATIONAL BUSINESS MACHINES CORPORATION Armonk, NY, United States

THE BOARD OF TRUSTEES OF THE UNIVERSITY OF ILLINOIS Urbana, IL, United States

Smart overview of the Invention

The patent application describes techniques for improving image compression with machine learning (ML), specifically with a super-resolution model. First, digital images are encoded with an encoder to produce encoded image data. A corresponding decoder then decodes that data into a first set of reconstructed images. Finally, a super-resolution ML model transforms the first set of reconstructed images into a second set with higher resolution. This model is trained to enhance image resolution based on the characteristics of at least one of these higher-resolution images.

Background and Context

The invention addresses the growing demand for high-quality image and video data, driven by applications such as social media and video streaming. As IoT and mobile devices become more prevalent, efficient image storage and transmission matter more, because these devices have limited battery life, computational power, and storage. Traditional compression methods such as JPEG are fast but often produce lower-quality images than ML-based techniques, which are slower yet deliver better quality.

Challenges with Existing Techniques

Traditional methods are fast, but at similar bitrates they show higher distortion and lower perceptual quality than ML-based approaches. They also require extensive manual design, which increases deployment costs. ML-based methods offer superior quality but are slower and less practical for real-time applications. The need to store each image at multiple resolutions for diverse devices further increases computation and storage costs.
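To make "distortion" concrete, the sketch below computes mean squared error (MSE) and peak signal-to-noise ratio (PSNR), two widely used distortion measures. This summary does not say which metrics the application relies on, so the choice of PSNR, the function names `mse` and `psnr`, and the sample pixel values are all assumptions made for illustration.

```python
import math


def mse(a, b):
    """Mean squared error between two equally sized grayscale images (lists of rows)."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += (pa - pb) ** 2
            count += 1
    return total / count


def psnr(a, b, max_value=255):
    """Peak signal-to-noise ratio in dB; higher means less distortion."""
    error = mse(a, b)
    if error == 0:
        return math.inf  # identical images have no distortion
    return 10 * math.log10(max_value ** 2 / error)


# Hypothetical 2x2 images: every pixel of the degraded copy is off by 10.
original = [[0, 0], [0, 0]]
degraded = [[10, 10], [10, 10]]

print(f"MSE:  {mse(original, degraded):.1f}")
print(f"PSNR: {psnr(original, degraded):.2f} dB")
```

Comparing two codecs "at similar bitrates" then means comparing such a score for outputs of roughly equal compressed size. PSNR does not capture perceptual quality, which is usually assessed with separate metrics.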

Innovative Solution

The proposed approach integrates deep learning-based super-resolution into the compression process, increasing throughput and reducing file sizes while maintaining high perceptual quality at comparable bitrates. Target images can be produced at any desired resolution, which suits devices with varying display capabilities. Because only a single source resolution is stored, the approach significantly improves processing speed and reduces storage requirements.
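The storage argument can be illustrated with simple arithmetic. The sketch below compares the raw pixel count of keeping a full set of resolutions against keeping one source resolution and upscaling on demand. The resolution ladder and the choice of 1280x720 as the stored resolution are hypothetical, and raw pixel counts are only a rough proxy for compressed file size.

```python
def pixel_count(width, height):
    """Number of pixels in an image of the given dimensions."""
    return width * height


# Hypothetical set of resolutions a service might otherwise store per image.
ladder = [(3840, 2160), (1920, 1080), (1280, 720), (640, 360)]

# Conventional approach: keep every resolution.
multi_resolution_pixels = sum(pixel_count(w, h) for w, h in ladder)

# Super-resolution approach: keep one source resolution (assumed 1280x720)
# and generate the other sizes when they are requested.
single_resolution_pixels = pixel_count(1280, 720)

ratio = multi_resolution_pixels / single_resolution_pixels
print(f"All resolutions: {multi_resolution_pixels} pixels")
print(f"Single source:   {single_resolution_pixels} pixels")
print(f"Reduction:       {ratio:.1f}x")
```

Under these assumed numbers the single stored resolution holds 12.5 times fewer pixels than the full set. The actual saving depends on the codec and on which resolution is kept.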

Technical Implementation

The system includes a compression block that processes digital images sequentially through several layers. An input image is first transformed into a lower-resolution version, using either bicubic interpolation or a learned transformation from a deep learning model. This lower-resolution image is then encoded into a latent representation that retains the essential visual information, using a suitable encoder, potentially a deep learning ML encoder. The result is efficient end-to-end compression with enhanced image quality.
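The stages described above can be sketched end to end as follows. This is a minimal illustration under loud assumptions, not the application's implementation: block averaging stands in for bicubic interpolation, `zlib` stands in for the encoder and decoder, and nearest-neighbour upscaling is a placeholder for the trained super-resolution model. All function names are hypothetical.

```python
import zlib


def downsample(img, factor):
    """Average each factor x factor block (stand-in for bicubic interpolation)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(0, h - h % factor, factor):
        row = []
        for x in range(0, w - w % factor, factor):
            block = [img[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def encode(img):
    """Serialize an 8-bit grayscale image; zlib stands in for the real encoder."""
    h, w = len(img), len(img[0])
    header = h.to_bytes(2, "big") + w.to_bytes(2, "big")
    payload = bytes(p for row in img for p in row)
    return header + zlib.compress(payload)


def decode(data):
    """Inverse of encode(): recover the low-resolution image."""
    h = int.from_bytes(data[0:2], "big")
    w = int.from_bytes(data[2:4], "big")
    flat = zlib.decompress(data[4:])
    return [list(flat[r * w:(r + 1) * w]) for r in range(h)]


def upscale(img, factor):
    """Nearest-neighbour upscaling; placeholder for a trained super-resolution model."""
    out = []
    for row in img:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


# Synthetic 64x64 grayscale gradient used as the input image.
source = [[(x * 4 + y * 2) % 256 for x in range(64)] for y in range(64)]

low_res = downsample(source, 2)      # compression side: reduce resolution
stored = encode(low_res)             # compression side: encode for storage
reconstructed = decode(stored)       # decompression side: decode
target = upscale(reconstructed, 2)   # decompression side: raise resolution

print(len(stored), "bytes stored for a", len(target), "x", len(target[0]), "target")
```

Because `upscale` takes the factor as a parameter, the same stored data can serve different target resolutions, which mirrors the flexibility the summary attributes to the super-resolution stage. A real model would be expected to recover detail that nearest-neighbour upscaling cannot.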