As a key research area in artificial intelligence, voice technology not only offers a new and transformative mode of human-machine interaction but may also open the way to emotional communication between humans and machines. Because voice matters so much to people, technology giants have once again turned their attention to intelligent voice. Recently, the Intelligent Voice Intellectual Property Industry Alliance, jointly initiated by Baidu, Haier, Jingdong, ZTE, China Putian, BAIC and BOE and made up of more than twenty organizations, was officially established. By forming a patent pool and licensing the pooled patents to alliance members free of charge, the alliance aims to guide and promote the adoption of voice technology across industries.
This is another expression of Baidu's "engineer-style idealism", following the opening of its voice capabilities two years ago. Drawing on its long-term accumulation of core intelligent voice technology, Baidu took the lead by contributing the first batch of more than 100 qualifying voice technology patents to the pool and licensing them openly to alliance members.
From opening its voice capabilities to opening its technology patents, Baidu's starting point and goal in sharing have always been the hope that technology can genuinely solve people's problems and make life better. Over the past two years, the makers of Lenovo smart TVs, Xiaomi phones, Xinli smart wearable devices and Tesla electric cars, as well as apps such as Qunar and Momo, have used Baidu's voice technology. This free and open access gives enterprises ready solutions to their voice problems, greatly shortens development cycles, and reduces development costs to some extent. For the many small and medium-sized startup developers, the openness of the technology also creates more room for innovation.
Generosity has also been repaid in kind: because Baidu provides its platform technology and opens its core technology for free, it has attracted more and more excellent developers. Voice technology depends heavily on its "corpus". A large developer base brings enough users who use voice functions frequently to build up that corpus, which gives Baidu's voice technology the opportunity to be trained continually and in turn keeps improving its recognition accuracy, noise robustness and semantic understanding. This has also built an ecosystem for effective user acquisition.
At this year's Baidu World conference, Baidu's chief scientist Andrew Ng (Wu Enda) demonstrated Baidu's new generation of speech recognition technology live. The test showed that in a noisy environment, machine recognition has surpassed human recognition. Baidu's speech recognition rate is close to 95%, making it the most advanced Chinese speech recognition technology in the world. Recently released figures show that for Mandarin speech in quiet environments, the relative recognition error rate of Baidu's technology is more than 15% lower than that of existing technology, and its recognition rate is close to 97%. Going from 95% to 99% is where quantitative change becomes qualitative change, and it may completely change the way people interact with devices.
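To make the arithmetic behind these percentages concrete, here is a minimal Python sketch of how a recognition rate relates to an error rate and to a relative error reduction. The function names are hypothetical and chosen only for illustration. The inputs treat the article's "close to 97%" and "more than 15%" as exactly 0.97 and 0.15, which is an assumption, and the "recognition rate" is taken to be one minus the error rate.

```python
def error_rate(accuracy: float) -> float:
    """Error rate implied by a recognition rate (both as fractions of 1)."""
    return 1.0 - accuracy


def relative_error_reduction(old_accuracy: float, new_accuracy: float) -> float:
    """Fraction by which the error rate shrinks when accuracy improves."""
    old_err = error_rate(old_accuracy)
    new_err = error_rate(new_accuracy)
    return (old_err - new_err) / old_err


def implied_baseline_accuracy(new_accuracy: float, relative_reduction: float) -> float:
    """Baseline accuracy implied by a new accuracy and a relative error reduction.

    Assumes new_err = old_err * (1 - relative_reduction).
    """
    new_err = error_rate(new_accuracy)
    old_err = new_err / (1.0 - relative_reduction)
    return 1.0 - old_err


if __name__ == "__main__":
    # Moving from 95% to 99% accuracy cuts the error rate from 5% to 1%.
    print(f"95% -> 99%: {relative_error_reduction(0.95, 0.99):.0%} fewer errors")

    # Illustrative assumption: exactly 97% accuracy and exactly a 15% relative reduction.
    baseline = implied_baseline_accuracy(0.97, 0.15)
    print(f"Implied baseline accuracy: {baseline:.2%}")
```

Seen this way, the step from 95% to 99% is larger than the four percentage points suggest. The error rate falls from 5% to 1%, so a user meets about one fifth as many mistakes. This is one way to read the article's "quantitative change to qualitative change".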
This breakthrough in voice technology is closely tied to Baidu's open thinking and approach. Establishing the alliance and contributing patents for sharing both reflect an open-minded wish to share the achievements of voice technology, advance the technology and the voice industry, and serve the mobile era. It is a simple path, but it offers fertile soil for innovation and disruption.
A so-called disruptive technology is one that can replace an existing technology and meet human needs better, and speech recognition is a case in point. Before 2011, speech recognition mainly relied on Gaussian mixture models. In 2011, deep learning was introduced into speech recognition, propelling the application of artificial intelligence across the whole industry into the deep learning era.
With the development of the Internet of Things and the Internet of Vehicles, more and more devices will need voice control. Voice recognition is an interaction that requires no physical contact: it gives the machine a real "sense of hearing" and lets people step away from the keyboard, freeing users' hands and saving their time and energy. Take the voice broadcast in Baidu News as an example: it generates news summaries with the help of natural language processing, helping users take in information more efficiently in an age of severe information overload. In the future, voice recognition will therefore become as essential to every intelligent terminal as the computer chip, and an important gateway to intelligent living built on human-computer communication.
The voice alliance reminds me of Britain's ARM, which opened up its technology through licensing and broke a technology monopoly. Today more than 95% of the world's smartphones and tablets use ARM's architecture, so that all companies can enjoy high-performance, low-cost, low-power chip technology, and the established chipmaker Intel has been challenged as a result.
Similarly, the voice alliance may be the next disruption to come out of China. By sharing voice technology patents and achievements, it can support deep development in artificial intelligence, the mobile Internet, intelligent terminals, smart homes, wearable devices and other fields. The technology could be used, for example, in voice dialing systems, online information queries, medical services and banking services. It will give upstream and downstream enterprises across the industry access to more cutting-edge technology, drive the development of intelligent voice and related industries, and create new industries and growth points.
By combining open innovation with open intellectual property licensing, the Voice Alliance shares the achievements of voice technology and nurtures a new voice interaction platform that connects all kinds of smart hardware and delivers all kinds of services. We can even imagine that, beyond the technology itself, speech recognition, as a means of communication rich in human emotion, may in future be more than a function or application that helps us solve problems. It may also become a carrier and embodiment of emotion and culture within scientific computing.