US20250068841
2025-02-27
Physics
G06F40/274
Embodiments described involve decoding continuous language from non-invasive brain recordings. Techniques such as functional magnetic resonance imaging (fMRI) and functional near-infrared spectroscopy (fNIRS) are used to monitor changes in blood oxygen levels linked to neural activity. These recordings feed into a language reconstruction model that combines a neural language model, which predicts the next word in a sequence, with an encoding model, which predicts the brain responses a candidate word sequence would evoke; together they reconstruct continuous language sequences from the recordings.
Brain-computer interfaces (BCIs) provide a communication link between the brain and external devices, offering potential applications for translating thoughts into text. While invasive methods can decode continuous speech, they require risky neurosurgery. Non-invasive methods have been limited to decoding single words, posing challenges for fluid conversation. The invention aims to overcome these issues by using non-invasive techniques to decode continuous language.
The approach involves using non-invasive brain recordings to detect changes in neural activity. The recordings are processed by a language reconstruction model, which includes a neural language model for proposing word sequences and an encoding model for scoring how well each sequence matches the recorded brain activity. This method involves generating candidate word sequences, scoring the likelihood that each would have evoked the recorded brain responses, and selecting the best matches.
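The generate-score-select step can be sketched as follows. This is a minimal illustration, not the patented implementation: the feature vectors, weights, and recorded response are toy values, and `predict_response` stands in for a trained encoding model; the Gaussian log-likelihood is one plausible way to score how well a predicted response matches the recording.

```python
# Toy sketch of scoring candidate word sequences against a recorded response.
# All names and numbers are illustrative assumptions, not the patent's values.

def predict_response(features, weights):
    """Encoding model stand-in: map stimulus features to a predicted response."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]

def log_likelihood(recorded, predicted, noise_var=1.0):
    """Gaussian log-likelihood of the recorded response given the prediction."""
    return -sum((r - p) ** 2 for r, p in zip(recorded, predicted)) / (2 * noise_var)

weights = [[1.0, 0.0], [0.5, 0.5]]   # assumed pre-trained encoding weights
recorded = [2.0, 1.5]                # assumed recorded brain response

# Each candidate sequence is summarized by a semantic feature vector.
candidates = {"sequence A": [2.0, 1.0], "sequence B": [0.0, 0.0]}

scores = {seq: log_likelihood(recorded, predict_response(feats, weights))
          for seq, feats in candidates.items()}
best = max(scores, key=scores.get)   # the sequence most likely to have evoked the recording
```

Here "sequence A" wins because its predicted response exactly matches the recording, while "sequence B" predicts no activity at all.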
The encoding model is trained using supervised learning, where brain activity measurements are taken as subjects listen to spoken narratives. Semantic features of the stimuli are extracted and used to predict how these features influence brain responses. The model's accuracy is gauged by comparing predicted brain responses to actual recorded responses, effectively scoring the likelihood of word sequences.
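The training procedure described above can be illustrated with a small ridge-regression fit, a common choice for linear encoding models (the patent does not specify the regression method, so this is an assumption). The stimulus features and voxel responses below are toy data; accuracy is gauged, as in the text, by correlating predicted with actual responses.

```python
# Toy encoding-model fit for one voxel with two semantic features, using
# ridge regression solved in closed form: w = (X^T X + lam*I)^{-1} X^T y.
# Data, dimensions, and the ridge penalty are illustrative assumptions.

def ridge_2feat(X, y, lam=0.1):
    """Fit two ridge-regularized weights via a hand-written 2x2 inverse."""
    a = sum(x[0] * x[0] for x in X) + lam
    b = sum(x[0] * x[1] for x in X)
    d = sum(x[1] * x[1] for x in X) + lam
    g0 = sum(x[0] * yi for x, yi in zip(X, y))
    g1 = sum(x[1] * yi for x, yi in zip(X, y))
    det = a * d - b * b
    return [(d * g0 - b * g1) / det, (a * g1 - b * g0) / det]

def pearson(u, v):
    """Correlation between predicted and actual responses (the accuracy gauge)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv)

# Rows = time points while a subject listens to narrative speech (toy values).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
y = [1.0, 0.5, 1.5, 2.0]            # responses generated by y = 1.0*f1 + 0.5*f2

w = ridge_2feat(X, y)
pred = [w[0] * x[0] + w[1] * x[1] for x in X]
r = pearson(pred, y)                # near 1.0 on this noiseless toy data
```

In practice the fit would be evaluated on held-out data rather than the training set; the in-sample correlation here simply shows the scoring mechanic.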
To manage the vast number of possible word sequences, a beam search algorithm generates candidate sequences incrementally. This method maintains a set of likely candidates based on brain activity in speech areas, refining predictions over time. The system offers potential for both restorative and augmentative communication applications without the need for invasive procedures.
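The beam search can be sketched as below. The bigram language model, the per-word encoding scores, and the combined-score ranking are all toy stand-ins (a real system would score whole candidate sequences against recorded responses, as in the training description); the sketch only shows how a fixed-width beam keeps the most likely candidates while extending them word by word.

```python
import math
from heapq import nlargest

# Hypothetical bigram LM: next-word log-probabilities (illustrative only).
LM = {
    "<s>": {"the": math.log(0.6), "a": math.log(0.4)},
    "the": {"dog": math.log(0.5), "cat": math.log(0.5)},
    "a":   {"dog": math.log(0.3), "cat": math.log(0.7)},
}
# Stand-in encoding scores: log-likelihood contribution of each word given
# the recorded brain activity (assumed values, not the patent's model).
ENC = {"the": 0.0, "a": -0.2, "dog": -0.1, "cat": -1.0}

def beam_search(steps=2, beam_width=2):
    """Extend candidates one word at a time, keeping the top-scoring beams."""
    beams = [(0.0, ["<s>"])]
    for _ in range(steps):
        expanded = []
        for score, seq in beams:
            for word, lp in LM.get(seq[-1], {}).items():
                # Combine LM likelihood with the encoding-model score.
                expanded.append((score + lp + ENC[word], seq + [word]))
        beams = nlargest(beam_width, expanded)  # prune to the beam width
    return beams

best_score, best_seq = beam_search()[0]
```

With these toy numbers the beam settles on "the dog": "cat" is probable under the language model but is penalized by the encoding score, mirroring how recorded brain activity steers the search among linguistically plausible candidates.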