Invention Title:

BILINGUAL MULTITASK MACHINE TRANSLATION MODEL FOR LIVE TRANSLATION ON ARTIFICIAL REALITY DEVICES

Publication number:

US20250103831

Publication date:

Section:

Physics

Class:

G06F40/58

Inventors:

Applicant:

Smart overview of the Invention

The patent application describes a machine translation model built into head-mounted displays for live translation. These displays recognize text in images through optical character recognition, or in speech through automatic speech recognition, and translate it from one language into another. The model learns several variants of the source text, each produced by a different modification task, with the aim of improving translation accuracy. The target text is the correctly translated and formatted output derived from these variants.

Technical Field

The application concerns head-mounted displays that run a machine translation model. The technology is intended to support real-time translation, so that users see translated text directly on the display. This is particularly useful for travelers in countries whose language they do not speak.

Functionality

The head-mounted display incorporates an image sensor, a machine translation model stored in memory, and processors that execute instructions. These instructions cover capturing images that contain text in a first language, formatting that text, and translating it into a second language. The translated content is then shown on the device as an image or video.
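The capture, format, translate, and display sequence described above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the names `Frame`, `LiveTranslationPipeline`, the `stub_*` functions, and `TOY_LEXICON` are hypothetical and do not come from the application, and each stage is a trivial stand-in for the real sensor, recognition, translation, and rendering components.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Frame:
    """Hypothetical stand-in for a captured camera frame."""
    pixels: bytes


@dataclass
class LiveTranslationPipeline:
    # Each stage is injected so that real components could replace the stubs.
    recognize_text: Callable[[Frame], str]  # hypothetical OCR stage
    format_text: Callable[[str], str]       # hypothetical formatting stage
    translate: Callable[[str], str]         # hypothetical translation model call
    render: Callable[[str], str]            # hypothetical display stage

    def run(self, frame: Frame) -> str:
        raw = self.recognize_text(frame)
        formatted = self.format_text(raw)
        translated = self.translate(formatted)
        return self.render(translated)


def stub_ocr(frame: Frame) -> str:
    # Assumption: pretend the frame bytes already are the recognized text.
    return frame.pixels.decode("utf-8")


def stub_format(text: str) -> str:
    # Collapse runs of whitespace and trim both ends.
    return " ".join(text.split())


# Toy Spanish-to-English lexicon, for illustration only.
TOY_LEXICON = {"salida": "exit", "de": "of", "emergencia": "emergency"}


def stub_translate(text: str) -> str:
    # Word-by-word lookup; unknown words pass through unchanged.
    return " ".join(TOY_LEXICON.get(word.lower(), word) for word in text.split())


def stub_render(text: str) -> str:
    # Stand-in for drawing the translated text on the display.
    return f"[overlay] {text}"


pipeline = LiveTranslationPipeline(stub_ocr, stub_format, stub_translate, stub_render)
result = pipeline.run(Frame(b"  SALIDA   de emergencia "))
print(result)
```

Passing the stages in as callables keeps the order of operations explicit, and any one stage could be swapped for a real component without touching the others.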

Method and Medium

The method involves obtaining source text in a first language, converting it into multiple word sequences, each produced by a different task, and generating a target text in a second language. The word sequences are mapped to the target text to support accurate translation. A non-transitory computer-readable medium can store instructions for performing these steps.
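The idea of deriving several word sequences from one source text and mapping them to a single target text can be illustrated with a toy example. This is a sketch under loud assumptions: the summary does not name the modification tasks, so the three tasks below (lowercasing, punctuation stripping, whitespace normalization) are hypothetical, as are `TASKS`, `build_word_sequences`, `generate_target`, and the dictionary lookup that stands in for the learned model.

```python
import string
from typing import Callable, Dict, List


# Hypothetical modification tasks; the source does not enumerate them.
def lowercase_task(text: str) -> List[str]:
    return text.lower().split()


def strip_punctuation_task(text: str) -> List[str]:
    table = str.maketrans("", "", string.punctuation)
    return text.translate(table).split()


def normalize_whitespace_task(text: str) -> List[str]:
    return text.split()


TASKS: Dict[str, Callable[[str], List[str]]] = {
    "lowercase": lowercase_task,
    "no_punctuation": strip_punctuation_task,
    "whitespace": normalize_whitespace_task,
}


def build_word_sequences(source_text: str) -> Dict[str, List[str]]:
    # One word sequence per task, all derived from the same source text.
    return {name: task(source_text) for name, task in TASKS.items()}


# Toy French-to-English lexicon standing in for a trained model.
TOY_LEXICON = {"bonjour": "hello", "le": "the", "monde": "world"}


def generate_target(sequences: Dict[str, List[str]]) -> str:
    # Look up each punctuation-free word in lowercase form,
    # then apply simple target-side formatting.
    words = [TOY_LEXICON.get(w.lower(), w.lower()) for w in sequences["no_punctuation"]]
    sentence = " ".join(words)
    if not sentence:
        return ""
    return sentence[0].upper() + sentence[1:] + "."


sequences = build_word_sequences("Bonjour,  le monde!")
target = generate_target(sequences)
print(sequences)
print(target)
```

In a trained system the mapping from the word sequences to the target text would be learned jointly rather than hand-written; the lookup here only shows how the data flows from one source text to several sequences and then to one formatted target.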

Applications and Advantages

The described technology supports multilingual communication through augmented reality devices. It helps users interact in linguistically diverse environments, including virtual spaces such as the Metaverse. The application presents the system as adaptable to activities such as socializing, learning, and shopping in virtual or augmented reality.