US20250356667
2025-11-20
Physics
G06V20/64
The method and apparatus described here perform three-dimensional (3D) object detection through a sequence of image-processing steps. First, two-dimensional (2D) image features are extracted from multiple images by an image backbone. A view transformer, designed for domain generalization, then converts these features into a 3D feature map that incorporates depth prediction information.
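The lifting step above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the common approach of predicting a per-pixel distribution over discrete depth bins and taking its outer product with the 2D features to form a 3D frustum volume. All shapes and names are hypothetical.

```python
import numpy as np

# Hypothetical shapes: C feature channels, D depth bins, HxW feature map.
C, D, H, W = 8, 4, 5, 6
rng = np.random.default_rng(0)

feats_2d = rng.standard_normal((C, H, W))      # 2D features from the image backbone
depth_logits = rng.standard_normal((D, H, W))  # per-pixel depth-bin logits

# Softmax over the depth axis turns logits into a depth distribution per pixel.
depth_prob = np.exp(depth_logits - depth_logits.max(axis=0, keepdims=True))
depth_prob /= depth_prob.sum(axis=0, keepdims=True)

# Outer product lifts the 2D features into a 3D frustum volume (C, D, H, W):
# each depth bin receives the pixel's features weighted by its depth probability.
frustum = feats_2d[:, None, :, :] * depth_prob[None, :, :, :]
print(frustum.shape)  # (8, 4, 5, 6)
```

The resulting frustum volume is what a subsequent pooling step can accumulate into a 3D (or bird's-eye-view) feature map.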
A BEV encoder then transforms the 3D feature map into a bird's eye view (BEV) feature, from which a detection head predicts the position and class of each detected object. Within the view transformer, a DepthNet predicts per-pixel depth outputs, which are fed together with the 2D features into a BEV pooling operation, enabling accurate 3D mapping.
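BEV pooling is commonly realized as a scatter-add: each lifted feature point is assigned to the BEV grid cell its 3D position falls into, and features landing in the same cell are summed. The sketch below assumes that formulation; the point-to-cell assignment is randomized purely for illustration.

```python
import numpy as np

# Toy BEV pooling: sum frustum features into a flat BEV grid by cell index.
rng = np.random.default_rng(1)
C, N = 4, 10                    # feature channels, number of frustum points
bev_x, bev_y = 3, 3             # BEV grid resolution (assumed)

feats = rng.standard_normal((N, C))           # lifted features per frustum point
cell_idx = rng.integers(0, bev_x * bev_y, N)  # BEV cell each point falls in

bev_flat = np.zeros((bev_x * bev_y, C))
np.add.at(bev_flat, cell_idx, feats)          # unbuffered scatter-add ("BEV pool")
bev = bev_flat.reshape(bev_x, bev_y, C)       # (X, Y, C) BEV feature map
```

`np.add.at` is used rather than plain indexing so that repeated cell indices accumulate instead of overwriting each other.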
A key aspect of the method is a relative depth normalization technique, which reduces errors in depth and position prediction caused by differences in camera parameters. It involves computing a transformation matrix describing the geometric transformation between each camera pair. The method further refines alignment between images using photometric matching based on the depth predictions.
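Two pieces of this step can be illustrated concretely. The normalization below assumes one common form of camera-aware depth normalization (rescaling metric depth by a focal-length ratio so targets are comparable across cameras); the patent's exact formulation may differ. The relative transform between a camera pair follows directly from their extrinsic matrices.

```python
import numpy as np

# Assumed form of relative depth normalization: rescale metric depth by the
# ratio of a reference focal length to the camera's focal length, so depth
# targets are comparable across cameras with different intrinsics.
def normalize_depth(depth, focal, ref_focal=1000.0):
    return depth * (ref_focal / focal)

# Relative pose between a camera pair from their 4x4 world-to-camera extrinsics:
# T_rel maps points expressed in camera a's frame into camera b's frame.
def relative_transform(T_a, T_b):
    return T_b @ np.linalg.inv(T_a)

# Example: camera b is camera a translated by 1 along x.
T_a = np.eye(4)
T_b = np.eye(4)
T_b[0, 3] = 1.0
T_rel = relative_transform(T_a, T_b)
```

With `T_rel`, depth predictions from one camera can be reprojected into the other view, which is what the photometric matching step compares against.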
The system incorporates domain adaptation adapters in components such as the image backbone and the view transformer. These adapters expose a small set of fine-tunable parameters and use operations such as skip connections so that gradients propagate and update effectively during adaptation. This adaptability allows the system to generalize across various domains.
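A common adapter design matching this description is a small bottleneck module with a residual skip connection, inserted alongside frozen backbone layers. The sketch below is a generic example of that pattern, not the patent's specific module; the zero-initialized up-projection makes the adapter start as an identity function, so inserting it does not disturb the pretrained network.

```python
import numpy as np

class Adapter:
    """Hypothetical bottleneck adapter with a residual skip connection."""

    def __init__(self, dim, bottleneck, seed=0):
        rng = np.random.default_rng(seed)
        self.w_down = rng.standard_normal((dim, bottleneck)) * 0.02  # down-project
        self.w_up = np.zeros((bottleneck, dim))  # zero-init: identity at start

    def __call__(self, x):
        h = np.maximum(x @ self.w_down, 0.0)  # down-project + ReLU
        return x + h @ self.w_up              # skip connection preserves input path

adapter = Adapter(dim=6, bottleneck=3)
x = np.ones((2, 6))
y = adapter(x)  # initially identical to x, thanks to the zero-init up-projection
```

During fine-tuning only `w_down` and `w_up` would be updated, while the surrounding backbone stays frozen; the skip connection gives gradients a direct path to those parameters.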
The method also augments the 3D feature map via decoupling-based image depth estimation, enhancing its robustness. The described system can be implemented in electronic devices with a memory and a processor capable of executing the corresponding instructions, making it applicable to fields such as autonomous vehicle navigation and robotics.
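The summary does not spell out what is decoupled. One plausible reading, consistent with the relative depth normalization above, is factoring metric depth into a camera-independent relative depth map and a separate scale; the sketch below illustrates only that assumed factorization, with purely illustrative names and values.

```python
import numpy as np

# Assumed decoupling: metric depth = per-image scale x relative depth map.
# The relative map carries scene structure; the scale carries camera-dependent
# magnitude, so the two can be predicted (and supervised) independently.
def recombine_depth(relative_depth, scale):
    return scale * relative_depth

rel = np.array([[0.5, 1.0],
                [2.0, 4.0]])           # camera-independent relative depth
metric = recombine_depth(rel, scale=3.0)  # metric depth for this camera
```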