Invention Title:

HOTWORD DETECTION ON MULTIPLE DEVICES

Publication number:

US20240169992

Publication date:

Section:

Physics

Class:

G10L15/285

Inventor:

Assignee:

Applicant:

Smart overview of the Invention

The patent application outlines a system for hotword detection across multiple devices. It describes a method in which a first computing device receives audio data corresponding to an utterance and calculates a likelihood score indicating whether the utterance contains a hotword. A second computing device calculates its own likelihood score for the same utterance. The scores are compared to determine which device should proceed with speech recognition processing, so that only one device responds to the hotword.
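As a rough illustration of the comparison described above, the following sketch decides whether the first device should proceed, given its own score and the score reported by the second device. The function name `should_start_recognition`, the assumption that scores lie in [0, 1], and the tie rule are all hypothetical; the summary does not specify them.

```python
def should_start_recognition(own_score: float, other_score: float) -> bool:
    """Return True if this device should begin speech recognition.

    Assumption: a strictly higher likelihood score wins. Tie handling is
    not described in the summary; here a tie means this device does not proceed.
    """
    return own_score > other_score


# The first device scored 0.9, the second device reported 0.6.
first_device_proceeds = should_start_recognition(0.9, 0.6)
second_device_proceeds = should_start_recognition(0.6, 0.9)
```

Under these assumptions only the first device proceeds, and the second stays idle.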

Technical Context

The technology operates within speech-enabled environments, such as homes or workplaces, where users interact with systems through voice input. Such environments use networks of microphones that capture users' utterances, so a user can issue commands or queries from anywhere in the environment without being close to a particular device. The system uses hotwords, like "OK computer," to activate devices for processing subsequent commands or questions.

Innovative Aspects

A key innovation involves multiple devices calculating and sharing hotword confidence scores when they detect an utterance. The device with the highest score takes precedence in processing audio data, while others remain inactive. This method prevents multiple devices from responding simultaneously, which could lead to confusion and inefficiency in user-device interactions.
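One way to picture the arbitration among more than two devices is sketched below. It assumes every device ends up with the same table of shared scores and applies the same rule to it, so all devices agree on the winner. The function `select_responder`, the device names, and the tie-break by device identifier are assumptions for illustration, not details taken from the application.

```python
def select_responder(scores: dict[str, float]) -> str:
    """Pick the single device that should process the audio data.

    Assumption: ties are broken by the lexicographically smallest device id,
    so every device reaches the same decision from the same shared scores.
    """
    if not scores:
        raise ValueError("no scores were shared")
    # Highest score first; on equal scores, the smallest id wins.
    return min(scores, key=lambda device_id: (-scores[device_id], device_id))


# Hypothetical scores shared among three devices that heard the utterance.
shared = {"phone": 0.62, "speaker": 0.91, "tablet": 0.47}
winner = select_responder(shared)
# Every device other than the winner remains inactive.
inactive = sorted(d for d in shared if d != winner)
```

A deterministic tie-break matters here because two devices that both believed they had won would reproduce the simultaneous-response problem the method is meant to avoid.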

Methodology

  • Audio data is received by a first computing device.
  • A likelihood value for the presence of a hotword is determined and compared with values from other devices.
  • The device with the highest likelihood value processes the audio data.
  • The remaining devices stay inactive and do not process the audio data further unless their own likelihood value turns out to be the highest.
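The steps listed above can be sketched from the point of view of a single device. This is a minimal simulation under stated assumptions: the `Device` class, its method names, and the `scorer` callable are hypothetical stand-ins (a real hotword model is not shown), and a device proceeds only when its value is strictly higher than every value it has received.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Device:
    name: str
    own_score: Optional[float] = None
    peer_scores: Dict[str, float] = field(default_factory=dict)

    def receive_audio(self, audio: bytes, scorer: Callable[[bytes], float]) -> float:
        # Receive audio data and determine the hotword likelihood value.
        self.own_score = scorer(audio)
        return self.own_score

    def receive_peer_score(self, peer: str, score: float) -> None:
        # Record a likelihood value reported by another device.
        self.peer_scores[peer] = score

    def should_process(self) -> bool:
        # Process only if no other device reported an equal or higher value.
        if self.own_score is None:
            return False
        return all(self.own_score > s for s in self.peer_scores.values())


kitchen = Device("kitchen")
hall = Device("hall")
audio = b"\x00\x01"  # placeholder bytes, not real audio

# Stand-in scorers: each device hears the same utterance with different clarity.
k = kitchen.receive_audio(audio, lambda _a: 0.85)
h = hall.receive_audio(audio, lambda _a: 0.40)

# The devices exchange their values.
kitchen.receive_peer_score("hall", h)
hall.receive_peer_score("kitchen", k)
```

With these values the kitchen device proceeds and the hall device remains inactive. The strict comparison means that on an exact tie no device proceeds, so a practical design would need an explicit tie-break rule.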

Advantages and Implementation

The system activates only the device with the highest hotword likelihood value, which reduces unnecessary responses from the other devices. This is useful in environments where several devices are within range of the same utterance: it limits unintended activations, so a single device responds to the user instead of several at once.