Microsoft speech recognition recognizes words ‘as well as humans’

Microsoft says its speech recognition technology now transcribes a conversation between two strangers as well as a human listener does. The company claims to be the first to reduce the transcription error rate to 5.9 percent.

According to Microsoft, the technology recognizes words ‘equivalent to humans’. The Microsoft team has been working towards this point for some time: last month it reported an error rate of 6.3 percent, which made the company the leader in this area at the time. In another test, in which two people who know each other can talk about whatever they want, Microsoft achieved an error rate of 11.1 percent, which is 0.2 percentage points lower than that of a professional human transcriber. The findings are described in detail in a paper.

The researchers say they reached the milestone using deep neural networks, which are trained with human help so that they keep improving. According to Microsoft, the use of GPUs for artificial intelligence also contributed to the speed with which the results were achieved.

The company also says it is working to address the variables that come into play in real life, such as background noise and accents. The next step for Microsoft is to improve understanding as well as recognition, which is just as important for the effectiveness of, for example, the digital assistant Cortana.

The IBM Watson team reported an error rate of 6.9 percent on the same test in April this year. It is unclear where IBM stands at the moment. On October 4, Google, which has just released its new Assistant, also announced that it was approaching human parity.
