Smart Home Voice Control Solutions: Let Your Home Understand You

As smart homes bring convenience, people's control habits are gradually changing, and advances in voice technology have added a new control entry point. Once the traditional remote control and mobile app are set aside, spoken commands can make the home environment comfortable and life more convenient and intelligent. Will voice control be the next universally adopted application in the smart home industry?

Far-field speech recognition, cloud-based semantic recognition, and artificial intelligence applications have all seen new breakthroughs, giving smart homes a new control entry option. This article presents the application prospects of voice control technology in smart home products and integration projects from several perspectives: technology trends, solutions, product applications, and project implementation.

The intelligent voice industry mainly refers to the industry that provides services to users through speech synthesis and speech recognition technology. Generally, the user only needs to speak a command to the service terminal to obtain the corresponding service. The industry has existed since the 1960s, but for a long time it was little known to the average consumer. In recent years, as Apple, Google, Microsoft, and other companies launched intelligent voice services such as Siri, these services and their related industries have begun to attract attention from general consumers and the investment community.

Voice control technology

Communicating with a machine and having it understand what you say is something people have long dreamed of. Speech recognition technology allows a machine to transform a speech signal into corresponding text or commands through a process of recognition and understanding.

Speech recognition is an interdisciplinary field. Combined with speech synthesis technology, it lets people dispense with the keyboard and operate devices through voice commands. The application of speech technology has become a competitive emerging high-tech industry.

Current problems with voice control technology

At present, voice-controlled smart hardware is widely criticized in many scenarios for its unsatisfactory voice interaction experience. The main causes are complex factors such as spatial distance, background noise, interference from other voices, echo, and reverberation, which result in clear pain points such as short recognition distance and low recognition rates.

In addition, Chinese has many dialects and accents, and many of its words carry multiple meanings, so recognition results vary widely among users from different regions. Semantic recognition also faces the problem that contextual association makes learning, locating, and model building difficult.

Several control techniques for speech recognition

Speech recognition technology is equivalent to installing an "ear" on a computer system so that it can "listen". The technology involves complex steps such as speech signal processing, speech feature processing, model training, and decoding by a decoding engine, so that the machine can finally identify the content, speaker, language, and other information in the speech. The implementation of voice control is closely tied to users' habits. Current implementations fall into two categories: near-field speech recognition and far-field speech recognition.

Near-field/far-field speech recognition technology

Near-field speech recognition requires the user to tap to start, and the user is relatively close to the terminal device. With a mobile phone or similar terminal, for example, the control function is carried out directly through the device itself.

Far-field speech recognition takes voice data picked up by a microphone array as input and uses a speech recognition algorithm to convert the voice signal into text. Although the principle is the same as in near-field speech recognition, the greater distance between the sound source and the microphone causes signal attenuation and various kinds of noise interference as the sound wave propagates, so special speech pickup and pre-processing techniques are required. Different pickup devices and pre-processing techniques often change the characteristics of the acoustic signal used for speech recognition. Therefore, the speech recognition engine needs to be customized and optimized for each far-field speech pickup technology.
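To give a rough sense of the attenuation involved, the sketch below applies the free-field inverse-distance law for a point source, under which the sound pressure level falls by about 6 dB each time the distance doubles. It ignores reflections and reverberation, and the speech and noise levels used are illustrative assumptions, not measurements from any product. The function names are hypothetical.

```python
import math


def level_drop_db(distance_m, reference_m=1.0):
    """Level drop relative to the reference distance (free-field point source)."""
    return 20.0 * math.log10(distance_m / reference_m)


def snr_db(speech_db_at_ref, noise_db, distance_m, reference_m=1.0):
    """Signal-to-noise ratio at the microphone for a steady noise floor."""
    return speech_db_at_ref - level_drop_db(distance_m, reference_m) - noise_db


# Assumed values: 60 dB speech at 1 m, 40 dB background noise.
for distance in (0.5, 1.0, 2.0, 4.0):
    print(f"{distance:>4} m: SNR = {snr_db(60.0, 40.0, distance):.1f} dB")
```

Under these assumptions the signal-to-noise ratio at 4 m is about 12 dB lower than at 1 m, which is one reason a recognizer tuned for near-field input tends to degrade at a distance.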

Because the speech signal attenuates as it propagates, reducing the strength and resolution of the captured signal, highly sensitive directional microphones are used, with their parameters tuned for far-field speech data, to capture the far-field speech signal as clearly as possible. Voice commands are also contaminated by ambient noise during transmission, which lowers the signal-to-noise ratio. Directional beamforming is used to suppress noise arriving from outside the target direction, reducing its interference with the voice signal. In a room, the microphone picks up not only the sound arriving directly from the source but also later reflections from the walls, and this lingering sound causes reverberation. Data collected by multiple microphones is processed by a multi-channel echo cancellation algorithm to separate sound arriving at different times, thereby eliminating the effect of reverberation on the sound data.
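The directional noise suppression described above is commonly done with beamforming. The following is a minimal delay-and-sum sketch, not the method of any particular product: it assumes a uniform linear array, a far-field plane wave, integer sample delays, and independent noise at each microphone. All function names, the array geometry, and the 440 Hz test tone are assumptions made for illustration.

```python
import math
import random


def steering_delays(num_mics, spacing_m, angle_deg, sample_rate, speed_of_sound=343.0):
    """Integer sample delays per microphone for a plane wave from angle_deg."""
    step = spacing_m * math.cos(math.radians(angle_deg)) / speed_of_sound * sample_rate
    raw = [m * step for m in range(num_mics)]
    earliest = min(raw)
    return [round(r - earliest) for r in raw]


def delay_and_sum(channels, delays):
    """Undo each channel's arrival delay, then average the aligned channels."""
    length = min(len(ch) for ch in channels) - max(delays)
    output = []
    for n in range(length):
        # A wavefront that reaches a microphone d samples late is read at n + d.
        total = sum(ch[n + d] for ch, d in zip(channels, delays))
        output.append(total / len(channels))
    return output


def tone(k, sample_rate):
    """Clean test signal: a 440 Hz sine wave."""
    return math.sin(2.0 * math.pi * 440.0 * k / sample_rate)


def simulate_array(delays, num_samples, sample_rate, noise_std, seed=0):
    """Each microphone hears the delayed tone plus its own Gaussian noise."""
    rng = random.Random(seed)
    return [
        [tone(k - d, sample_rate) + rng.gauss(0.0, noise_std) for k in range(num_samples)]
        for d in delays
    ]


def mean_square(values):
    return sum(v * v for v in values) / len(values)


SAMPLE_RATE = 16000
delays = steering_delays(num_mics=4, spacing_m=0.05, angle_deg=0.0, sample_rate=SAMPLE_RATE)
channels = simulate_array(delays, num_samples=2000, sample_rate=SAMPLE_RATE, noise_std=0.5)
output = delay_and_sum(channels, delays)

# The first microphone has zero delay, so its error is its noise alone.
single_mic_error = mean_square(
    [channels[0][n] - tone(n, SAMPLE_RATE) for n in range(len(output))]
)
beamformed_error = mean_square(
    [output[n] - tone(n, SAMPLE_RATE) for n in range(len(output))]
)
print("single microphone error power:", single_mic_error)
print("beamformed error power:", beamformed_error)
```

Because the speech adds coherently after alignment while the independent noise does not, the error power of the averaged output should be noticeably lower than that of a single microphone. Sound arriving from other directions would be misaligned by the same delays and partially cancel.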

Wake-up target detection technology

When devices are controlled by voice at a distance, sound may come from different people in different directions, so the first step is to determine which sounds are commands and which are not. The microphone array beamforming algorithm vertically divides the 360-degree space into several regions, with each microphone responsible for detecting a specified region. When a wake word is detected in a region, pickup by the microphone for that region is enhanced and pickup for the other regions is suppressed. Sound is thus picked up directionally, so that voices from a nearby television or other people's conversations do not interfere with voice commands.
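The region-selection logic above can be sketched as follows. This is only an illustration of the gating step: the per-region wake-word scores are assumed to come from some detector that is not shown, and the threshold and gain values are hypothetical placeholders rather than values from any real system.

```python
def sector_index(angle_deg, num_sectors):
    """Map a direction in degrees to the index of the region that covers it."""
    width = 360.0 / num_sectors
    index = int((angle_deg % 360.0) // width)
    return min(index, num_sectors - 1)  # guard against floating-point edge cases


def sector_gains(wake_scores, threshold=0.5, boost=2.0, suppress=0.1):
    """Return one pickup gain per region.

    wake_scores holds one (hypothetical) wake-word confidence per region.
    If no region reaches the threshold, all regions keep unit gain.
    """
    best = max(range(len(wake_scores)), key=lambda i: wake_scores[i])
    if wake_scores[best] < threshold:
        return [1.0] * len(wake_scores)
    return [boost if i == best else suppress for i in range(len(wake_scores))]


# Six regions of 60 degrees each; the wake word is heard most clearly in region 1.
scores = [0.05, 0.92, 0.10, 0.02, 0.01, 0.03]
print(sector_gains(scores))
print(sector_index(95.0, 6))
```

With six regions, a speaker at 95 degrees falls in region 1, so that region's pickup is boosted while a television in any other region is attenuated.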

Playback interruption technique

When a device such as a smart speaker is controlled by voice, it is often playing a song at the time. Since the microphone is mounted on the speaker itself, the distance between the microphone and the user is much larger than the distance between the microphone and the loudspeaker, so the playback sound reaching the microphone is far stronger than the user's voice. Both internal and external measures are used to solve this problem. Internally, a special echo cancellation algorithm reduces the effect of the playback sound on the microphone. Because traditional linear echo cancellation fails for the nonlinear interference caused by vibration, a nonlinear echo cancellation algorithm can be used to improve internal noise suppression. Externally, in the structural design, a carefully designed damping structure for the microphone array minimizes vibration between the microphones and the circuit board they are connected to, limiting as far as possible the interference with pickup from speaker-body vibration caused by high sound intensity.
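As an illustration of the internal approach, the sketch below shows a normalized least-mean-squares (NLMS) adaptive filter, a common building block for linear echo cancellation. It is not the "special" algorithm the article refers to, and it does not handle the nonlinear, vibration-induced interference mentioned above. The echo path, tap count, and step size are assumptions chosen for the demonstration, and random noise stands in for the music being played.

```python
import random


def nlms_echo_cancel(mic, reference, num_taps=8, step=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the playback echo from the microphone signal.

    mic: samples captured by the microphone (echo plus any speech).
    reference: the samples the device itself is playing.
    Returns the residual signal and the learned filter weights.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent reference samples, newest first
    residual = []
    for mic_sample, ref_sample in zip(mic, reference):
        history = [ref_sample] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = mic_sample - echo_estimate
        norm = sum(h * h for h in history) + eps
        # NLMS update: step scaled by the energy of the recent reference samples.
        weights = [w + step * error * h / norm for w, h in zip(weights, history)]
        residual.append(error)
    return residual, weights


def mean_square(values):
    return sum(v * v for v in values) / len(values)


rng = random.Random(1)
reference = [rng.uniform(-1.0, 1.0) for _ in range(2000)]  # stand-in for music
echo_path = [0.6, 0.3, -0.1]  # hypothetical loudspeaker-to-microphone response
mic = [
    sum(echo_path[k] * reference[n - k] for k in range(len(echo_path)) if n - k >= 0)
    for n in range(len(reference))
]

residual, weights = nlms_echo_cancel(mic, reference)
mic_power = mean_square(mic[-500:])
tail_power = mean_square(residual[-500:])
print("echo power before cancellation:", mic_power)
print("residual power after adaptation:", tail_power)
```

In this idealized, purely linear case the filter weights converge toward the assumed echo path and the residual falls to a small fraction of the original echo. When the user speaks, the voice is not predictable from the reference signal and therefore remains in the residual, which is what is passed on for recognition.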
