Invention Title:

METHOD AND ELECTRONIC DEVICE FOR TRAINING NEURAL NETWORK MODEL BY AUGMENTING IMAGE REPRESENTING OBJECT CAPTURED BY MULTIPLE CAMERAS

Publication number:

US20240135686

Section:

Physics

Class:

G06V10/774

Smart overview of the Invention

The proposed method trains a neural network model by augmenting images of objects captured by multiple cameras. It begins by obtaining an object recognition result from a first neural network model, which takes as input a first image captured by a first camera from one viewpoint. This result is then converted from the first camera's coordinate system into that of a second camera, which captures the same space from a different viewpoint.
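The summary does not say how the conversion between camera coordinate systems is carried out. One common way to relate two cameras is a rigid transform (a rotation plus a translation), so the sketch below assumes that form and assumes the recognition result is a list of 3D points. The names convert_point, convert_recognition_result, R_12, and t_12 are hypothetical and do not come from the patent.

```python
from typing import List, Sequence

Vec3 = Sequence[float]
Mat3 = Sequence[Sequence[float]]


def convert_point(p_cam1: Vec3, R_12: Mat3, t_12: Vec3) -> List[float]:
    """Map one 3D point from camera-1 to camera-2 coordinates: p2 = R_12 * p1 + t_12.

    Assumption: the two cameras are related by a known rigid transform.
    """
    return [
        sum(R_12[i][j] * p_cam1[j] for j in range(3)) + t_12[i]
        for i in range(3)
    ]


def convert_recognition_result(points_cam1: Sequence[Vec3],
                               R_12: Mat3, t_12: Vec3) -> List[List[float]]:
    """Convert every 3D point of a recognition result into camera-2 coordinates."""
    return [convert_point(p, R_12, t_12) for p in points_cam1]


# Example: camera 2 is rotated 90 degrees about the z-axis and shifted along x.
R_12 = [[0.0, -1.0, 0.0],
        [1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0]]
t_12 = [1.0, 0.0, 0.0]
print(convert_recognition_result([[1.0, 0.0, 2.0]], R_12, t_12))
# prints [[1.0, 1.0, 2.0]]
```

In practice the rotation and translation would come from calibrating the two cameras against each other, and a recognition result might hold boxes, keypoints, or poses rather than bare points.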

Data Generation Process

Training data is generated by labeling a second image that corresponds to the first image. The second image is captured by the second camera, and its labels come from the converted object recognition result, which is expressed from the second camera's viewpoint. Labeling the second image this way yields training data intended to improve the performance of a neural network model.
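As a rough illustration of the labeling step, the sketch below projects 3D points that are already expressed in the second camera's coordinate system into the second image and wraps them in a bounding-box label. The pinhole projection, the intrinsics fx, fy, cx, cy, and the names LabeledSample, project, and label_second_image are all assumptions made for this example; the summary does not specify the label format.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class LabeledSample:
    image_id: str                              # identifier of the second image
    class_name: str                            # class predicted by the first model
    bbox: Tuple[float, float, float, float]    # x_min, y_min, x_max, y_max in pixels


def project(p: Sequence[float], fx: float, fy: float,
            cx: float, cy: float) -> Tuple[float, float]:
    """Pinhole projection of a camera-frame 3D point to pixel coordinates."""
    x, y, z = p
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (fx * x / z + cx, fy * y / z + cy)


def label_second_image(image_id: str, class_name: str,
                       points_cam2: Sequence[Sequence[float]],
                       fx: float, fy: float, cx: float, cy: float) -> LabeledSample:
    """Build a label for the second image from points already in camera-2 coordinates."""
    pixels: List[Tuple[float, float]] = [project(p, fx, fy, cx, cy) for p in points_cam2]
    us = [u for u, _ in pixels]
    vs = [v for _, v in pixels]
    return LabeledSample(image_id, class_name, (min(us), min(vs), max(us), max(vs)))


sample = label_second_image(
    "cam2_frame_0001", "chair",                # hypothetical image id and class
    [(-1.0, -1.0, 2.0), (1.0, 1.0, 2.0)],      # two corners of the object, in metres
    fx=100.0, fy=100.0, cx=320.0, cy=240.0,
)
print(sample.bbox)
```

The second image is paired with labels that originate from the first camera's prediction, so no separate annotation pass over the second image is needed in this sketch.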

Training of Neural Network Models

The generated training data is then used to train a second neural network model. Because the data is augmented with images and labels derived from both cameras, the second model is intended to recognize objects more robustly across different viewpoints.
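The summary does not describe the second model's architecture or its training procedure. Purely to show the shape of the step (generated samples in, fitted parameters out), the sketch below uses a single linear unit trained by full-batch gradient descent as a stand-in for the second neural network model. The functions train_linear_model and mean_squared_error, the learning rate, and the toy samples are assumptions, not details from the patent.

```python
from typing import Sequence, Tuple

Sample = Tuple[float, float]  # (input feature, label taken from the generated training data)


def mean_squared_error(samples: Sequence[Sample], w: float, b: float) -> float:
    """Average squared difference between predictions w*x + b and labels."""
    return sum((w * x + b - y) ** 2 for x, y in samples) / len(samples)


def train_linear_model(samples: Sequence[Sample],
                       lr: float = 0.05, epochs: int = 500) -> Tuple[float, float]:
    """Fit y ~ w*x + b by full-batch gradient descent on the mean squared error.

    Stand-in for training the second model; a real model would be a deep network.
    """
    if not samples:
        raise ValueError("no training samples")
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy samples standing in for (second-image feature, converted label) pairs.
generated = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train_linear_model(generated)
print(mean_squared_error(generated, 0.0, 0.0), mean_squared_error(generated, w, b))
```

The two printed values are the error before and after training; the second should be far smaller than the first.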

Electronic Device Configuration

An electronic device implementing this method includes a camera, a communication unit, a memory storing instructions, and a processor. The processor executes the instructions to obtain object recognition results, convert them between camera coordinate systems, and generate training data for further model training.
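The three operations attributed to the processor can be read as a small pipeline over paired frames. The sketch below is one possible arrangement, with each stage passed in as a callable; generate_training_data and the parameter names recognize, convert, and label are hypothetical, and the stub stages in the usage lines exist only to make the example runnable.

```python
from typing import Any, Callable, List, Sequence


def generate_training_data(
    first_images: Sequence[Any],
    second_images: Sequence[Any],
    recognize: Callable[[Any], Any],     # first model: first image -> recognition result
    convert: Callable[[Any], Any],       # camera-1 result -> camera-2 result
    label: Callable[[Any, Any], Any],    # (second image, converted result) -> training sample
) -> List[Any]:
    """Run recognize -> convert -> label for each pair of corresponding frames."""
    if len(first_images) != len(second_images):
        raise ValueError("each first image needs a corresponding second image")
    samples = []
    for img1, img2 in zip(first_images, second_images):
        result_cam1 = recognize(img1)
        result_cam2 = convert(result_cam1)
        samples.append(label(img2, result_cam2))
    return samples


# Stub stages for illustration only.
samples = generate_training_data(
    ["a", "bb"], ["A", "B"],
    recognize=lambda img: {"x": len(img)},
    convert=lambda r: {"x": r["x"] + 1},
    label=lambda img, r: (img, r["x"]),
)
print(samples)
# prints [('A', 2), ('B', 3)]
```

Passing the stages as callables keeps the pipeline independent of any particular model or calibration method, neither of which the summary specifies.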

Cloud Server Implementation

A cloud server can also implement this method, with components similar to those of the electronic device. It receives object recognition results and images captured from different viewpoints through its communication unit, then performs the coordinate conversion and generates the training data, so that neural network models can be trained in a cloud-based environment.