US20260024538
2026-01-22
Physics
G10L19/038
The patent application describes a karaoke sound system designed to improve audio quality in hands-free environments. It combines a conventional frequency-domain Kalman filter (FDKF) with neural network methods to handle two problems: feedback suppression and vocal restoration. The aim is a clearer output signal, particularly in acoustic environments where background noise and reverberation degrade sound quality.
At the core of the system is a method that processes audio signals captured by a microphone. The audio signal is first input into an FDKF to manage acoustic feedback. The processed signal, along with the original audio, is then fed into a neural network designed to estimate and remove unwanted feedback. This neural network is further integrated with a codec to recover the vocal quality, ensuring the target vocal signal is enhanced and free from distortions.
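The signal flow described above can be sketched as a chain of three stages. This is only an illustrative wiring diagram in Python, not the application's implementation: the function name `enhance_vocal`, the `Signal` type, and the three toy stand-in stages are all hypothetical, and the real FDKF, neural network, and codec are far more involved. It also assumes that the network returns the feedback-suppressed signal directly, which the summary does not specify.

```python
from typing import Callable, List

Signal = List[float]  # hypothetical: one frame of audio samples


def enhance_vocal(
    mic: Signal,
    fdkf: Callable[[Signal], Signal],
    feedback_net: Callable[[Signal, Signal], Signal],
    codec: Callable[[Signal], Signal],
) -> Signal:
    """Chain the three stages in the order the summary describes."""
    # Stage 1: the microphone signal goes through the FDKF.
    fdkf_out = fdkf(mic)
    # Stage 2: the network sees BOTH the original signal and the FDKF output.
    suppressed = feedback_net(mic, fdkf_out)
    # Stage 3: the codec restores vocal quality.
    return codec(suppressed)


# Toy stand-ins (assumptions for illustration only, not real algorithms).
def toy_fdkf(x: Signal) -> Signal:
    return [0.5 * s for s in x]


def toy_net(mic: Signal, fdkf_out: Signal) -> Signal:
    return [(m + f) / 2 for m, f in zip(mic, fdkf_out)]


def toy_codec(x: Signal) -> Signal:
    return list(x)


if __name__ == "__main__":
    print(enhance_vocal([1.0, 2.0], toy_fdkf, toy_net, toy_codec))
```

Passing the stages as callables keeps the data flow visible: the second stage takes two inputs, which is the distinguishing feature of the hybrid arrangement in the summary.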
The system features a neural network adaptive feedback cancellation (NNAFC) mechanism, utilizing a two-layer Long Short-Term Memory (LSTM) network. This setup effectively suppresses music and playback components in the audio signal. Additionally, a residual vector quantization (RVQ) codec module is employed to compress and reconstruct the vocal signal, enhancing the overall audio quality. The codec and neural network are trained together for optimal performance.
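To make the idea of residual vector quantization concrete, here is a minimal sketch of generic RVQ encoding and decoding, assuming small fixed codebooks. The names `rvq_encode`, `rvq_decode`, and `CODEBOOKS`, as well as the codebook values, are hypothetical; in the system described above the codec is trained jointly with the neural network, which this toy example does not attempt.

```python
from typing import List

Vector = List[float]
Codebook = List[Vector]


def nearest(codebook: Codebook, v: Vector) -> int:
    """Index of the codeword closest to v in squared Euclidean distance."""
    best, best_dist = 0, float("inf")
    for i, c in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(c, v))
        if dist < best_dist:
            best, best_dist = i, dist
    return best


def rvq_encode(x: Vector, codebooks: List[Codebook]) -> List[int]:
    """Quantize x stage by stage; each stage codes the previous residual."""
    residual = list(x)
    indices = []
    for cb in codebooks:
        i = nearest(cb, residual)
        indices.append(i)
        residual = [r - c for r, c in zip(residual, cb[i])]
    return indices


def rvq_decode(indices: List[int], codebooks: List[Codebook]) -> Vector:
    """Reconstruct by summing the selected codeword from each stage."""
    out = [0.0] * len(codebooks[0][0])
    for i, cb in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, cb[i])]
    return out


# Hypothetical toy codebooks: a coarse stage followed by a fine stage.
CODEBOOKS = [
    [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0], [1.0, -1.0]],
    [[0.0, 0.0], [0.25, 0.0], [0.0, 0.25], [-0.25, -0.25]],
]

if __name__ == "__main__":
    codes = rvq_encode([1.2, 0.9], CODEBOOKS)
    print(codes, rvq_decode(codes, CODEBOOKS))
```

Each additional stage quantizes what the previous stages left over, so the reconstruction error shrinks as stages are added while the transmitted data stays a short list of indices.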
The technology applies to settings where audio clarity is crucial, such as karaoke systems, video conferencing, and hearing aids. By combining an FDKF with neural networks, the system targets echo cancellation, howling suppression, and noise reduction, with the goal of keeping the audio output clear and stable.
The proposed karaoke sound system integrates traditional and neural methods to address limitations of existing systems. Its intended benefits are improved vocal quality and stronger feedback suppression in environments that require high-quality audio.