Invention Title:

ELECTRONIC DEVICE, TERMINAL, AND OPERATING METHOD WITH NEURAL NETWORK LIGHTWEIGHTING

Publication number:

US20260093989

Publication date:
Section:

Physics

Class:

G06N3/082

Inventors:

Assignees:

Applicants:

Smart overview of the Invention

The electronic device lightweights a neural network, reducing its computational complexity while maintaining performance. It does so by generating multiple candidate neural networks in which different subsets of nonlinear layers are excluded; each segment of the source network comprises a nonlinear layer and a convolution layer. The device selects the most efficient candidate based on importance and latency values, then merges the successive convolution layers left directly adjacent by the removed nonlinear layers to form the final, streamlined neural network.
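The candidate-generation step described above can be sketched as enumerating the subsets of nonlinear layers to exclude. This is a minimal illustration, not the patented implementation: the segment names and layer labels are hypothetical, and a real system would score each candidate rather than materialize all of them.

```python
from itertools import combinations

# Hypothetical model: a network is a list of segments, each holding a
# nonlinear layer ("relu") followed by a convolution layer ("conv").
segments = ["seg0", "seg1", "seg2"]

def candidates(segments):
    """Return (excluded_set, layer_list) for every subset of
    nonlinear layers removed from the network."""
    results = []
    for r in range(len(segments) + 1):
        for excluded in combinations(segments, r):
            layers = []
            for seg in segments:
                if seg not in excluded:
                    layers.append(f"{seg}.relu")  # nonlinear layer kept
                layers.append(f"{seg}.conv")      # convolution always kept
            results.append((set(excluded), layers))
    return results

cands = candidates(segments)
print(len(cands))  # 2**3 = 8 candidates for 3 segments
```

Excluding every nonlinear layer leaves a chain of convolutions only, which is exactly the case where the merging step later collapses them into one representative layer.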

Background

Neural networks are central to machine learning, powering tasks such as image and voice recognition. They consist of layers of interconnected nodes, and performance generally improves with depth. Deeper networks, however, demand more computational resources and inference time. This invention addresses that trade-off by lightweighting neural networks, balancing depth against efficiency.

Candidate Selection

The device evaluates candidate networks by comparing the kernel sizes of the convolution layers within each succession segment and identifying a representative layer. It then selects a candidate based on a combination of latency and importance values: the sum of latency values must stay below a threshold while the importance values are maximized, optimizing both performance and efficiency.
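The selection rule above amounts to a constrained maximization over the candidate set. A minimal sketch, assuming each candidate carries per-layer latency and importance values (the dictionary fields and candidate names are illustrative, not from the patent):

```python
def select_candidate(cands, latency_budget):
    """Return the candidate with the highest total importance whose
    total latency stays under the budget, or None if none qualifies."""
    feasible = [c for c in cands if sum(c["latencies"]) <= latency_budget]
    if not feasible:
        return None
    return max(feasible, key=lambda c: sum(c["importances"]))

cands = [
    {"name": "A", "latencies": [2, 3], "importances": [0.9, 0.4]},
    {"name": "B", "latencies": [1, 1], "importances": [0.5, 0.3]},
    {"name": "C", "latencies": [4, 4], "importances": [1.0, 1.0]},
]
best = select_candidate(cands, latency_budget=5)
print(best["name"])  # "A": latency 5 <= 5 and importance 1.3 beats B's 0.8
```

Candidate C has the highest importance but exceeds the latency threshold, so it is excluded before the importance comparison, mirroring the constraint-then-maximize order described above.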

Layer Merging

When multiple convolution layers are identified within a segment, they are merged into a single representative layer; when only one layer is found, it becomes the representative layer itself. Where kernel sizes are equal, the layer with the largest weight sum is selected and the others are excluded, keeping the network efficient without unnecessary complexity.
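Merging successive convolutions is possible because, with no nonlinear layer between them, their composition is itself a convolution whose kernel is the convolution of the individual kernels. A 1-D NumPy sketch under simplifying assumptions (valid padding, no bias; the function names are illustrative):

```python
import numpy as np

def conv1d(x, kernel):
    """A 1-D convolution layer (valid padding, no bias) standing in
    for the patent's convolution layers."""
    return np.convolve(x, kernel, mode="valid")

def merge_kernels(kernels):
    """Collapse successive convolution kernels into one representative
    kernel; valid only when no nonlinearity sits between the layers."""
    merged = kernels[0]
    for k in kernels[1:]:
        merged = np.convolve(merged, k)  # compose linear maps
    return merged

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
k1 = np.array([1.0, 1.0])    # first convolution layer
k2 = np.array([1.0, -1.0])   # second convolution layer

two_step = conv1d(conv1d(x, k1), k2)          # original two layers
one_step = conv1d(x, merge_kernels([k1, k2])) # merged representative layer
print(two_step, one_step)  # both: [2. 2. 2.]
```

The merged network computes the same outputs with a single layer pass, which is the source of the latency saving the selection step trades against importance.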

Final Neural Network

The final neural network is the target lightweight version, optimized for both performance and latency. It is retrained to readjust its weight values so that it satisfies the specified profile requirements. The device can then generate inferential data efficiently, making the method suitable for applications that demand fast, reliable neural network processing.
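The retraining step can be sketched as fine-tuning the merged kernel so its outputs track the original network's, a simple distillation-style objective. This is an assumption-laden illustration (the teacher/student naming, loss, and learning rate are not from the patent):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: the "teacher" kernel is the original network's
# behaviour; the "student" is the merged kernel after lightweighting,
# perturbed to mimic accuracy lost during merging.
teacher_kernel = np.array([1.0, 0.0, -1.0])
student_kernel = teacher_kernel + 0.5 * rng.standard_normal(3)

lr = 0.01
K = len(student_kernel)
for _ in range(500):
    x = rng.standard_normal(8)  # random input sample
    y_t = np.convolve(x, teacher_kernel, mode="valid")
    y_s = np.convolve(x, student_kernel, mode="valid")
    err = y_s - y_t
    # Gradient of 0.5*||err||^2 w.r.t. each kernel weight: dot the error
    # with the input window that weight multiplies (valid convolution).
    grad = np.array(
        [np.dot(err, x[K - 1 - j : K - 1 - j + err.size]) for j in range(K)]
    )
    student_kernel -= lr * grad
```

With no observation noise the student converges to the teacher's weights, illustrating how retraining restores the merged network's accuracy within the latency budget already secured by merging.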