Gmail’s Inbox gets ‘Smart Reply’ suggestions for email
Google’s Inbox by Gmail app is getting a ‘Smart Reply’ feature that offers ready-made reply options. Smart Reply suggests three short replies based on the email the user receives: for messages that only need a quick answer, Inbox ‘thinks up’ three different quick responses.
Initially, the feature will only be available in English. Users who have their language set to English can expect the update “later this week,” the Gmail team writes on its blog. The system gets ‘smarter’ the more it is used, and if all goes well, the suggested answers are ‘sfw’, or suitable for the workplace.
To generate the answers, Google built a neural network that serves as the basis for the Smart Reply feature. In an extensive post on the Google Research blog, researcher Greg Corrado describes how the research behind the automatic reply feature was conducted. The Smart Reply system is built on a pair of recurrent neural networks: one to encode the incoming email and one to predict possible replies. The encoding network ‘reads’ the incoming email word by word and condenses it into a vector, a list of numbers. That vector should capture the essence of what is being said, without getting hung up on phrasing or word choice. As an example, Corrado notes that ‘Are you free tomorrow?’ should end up close to the same vector as ‘Does tomorrow work for you?’. The second network starts from that ‘thought vector’ and formulates a grammatically correct reply, again word by word. The amazing thing, according to Corrado, is that “the whole behaviour of the network is completely learned, just by training the model to predict possible replies.”
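To make that encoder/decoder idea concrete, here is a minimal sketch of such a pair of recurrent networks in PyTorch. It is purely illustrative: the vocabulary size, layer dimensions and random inputs are made up for the example, and it is not Google’s actual model.

```python
import torch
import torch.nn as nn

# Illustrative sizes only; not the dimensions of Google's model.
VOCAB_SIZE, EMBED_DIM, HIDDEN_DIM = 1000, 64, 128

class Encoder(nn.Module):
    """Reads the incoming email word by word and condenses it into a 'thought vector'."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
        self.rnn = nn.LSTM(EMBED_DIM, HIDDEN_DIM, batch_first=True)

    def forward(self, tokens):                    # tokens: (batch, seq_len)
        _, (h, c) = self.rnn(self.embed(tokens))
        return h, c                               # final state acts as the thought vector

class Decoder(nn.Module):
    """Starts from the thought vector and scores the next reply word, one step at a time."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
        self.rnn = nn.LSTM(EMBED_DIM, HIDDEN_DIM, batch_first=True)
        self.out = nn.Linear(HIDDEN_DIM, VOCAB_SIZE)

    def forward(self, prev_tokens, state):        # reply generated so far + encoder state
        output, state = self.rnn(self.embed(prev_tokens), state)
        return self.out(output), state            # logits over the next word

# Toy usage: encode a fake tokenised email and score candidate next words of a reply.
email = torch.randint(0, VOCAB_SIZE, (1, 12))         # a 12-'word' email
reply_so_far = torch.randint(0, VOCAB_SIZE, (1, 3))   # first 3 'words' of a reply
encoder, decoder = Encoder(), Decoder()
thought_vector = encoder(email)
logits, _ = decoder(reply_so_far, thought_vector)
print(logits.shape)                                   # (1, 3, VOCAB_SIZE)
```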
One of the biggest challenges is that an email is often hundreds of words long. This is where a special type of neural network comes into play: the ‘long short-term memory’ network, or LSTM. Like other recurrent neural networks, or RNNs, these networks retain information over time, but in practice they are much better at linking pieces of information that lie far apart. Because LSTMs can do that, the network manages to distil a meaningful reply from the relevant sentences without being distracted by everything in between.
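As a rough illustration of what ‘retaining information over long spans’ means in practice, the toy snippet below pushes a few hundred stand-in word embeddings through an LSTM cell one step at a time. The dimensions and random inputs are invented for the example.

```python
import torch
import torch.nn as nn

# Illustrative only: feed a long 'email' through an LSTM one word at a time.
EMBED_DIM, HIDDEN_DIM, SEQ_LEN = 32, 64, 300   # a few hundred 'words'

lstm_cell = nn.LSTMCell(EMBED_DIM, HIDDEN_DIM)
h = torch.zeros(1, HIDDEN_DIM)                 # hidden state (short-term)
c = torch.zeros(1, HIDDEN_DIM)                 # cell state (long-term memory)

words = torch.randn(SEQ_LEN, 1, EMBED_DIM)     # stand-in word embeddings
for word in words:
    h, c = lstm_cell(word, (h, c))             # gates decide what to keep and what to forget

# After the whole email, (h, c) summarises it; a plain RNN's single hidden state
# tends to lose the early words over a span this long.
print(h.shape, c.shape)
```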
The system’s first prototype produced some odd responses and other strange quirks. For example, generating candidate answers initially yielded three suggestions that were nearly identical, such as ‘shall we meet tomorrow?’, ‘shall we get together tomorrow?’ and ‘how about tomorrow?’. A component that maps the semantic intent of each response was then added, making the suggestions more diverse. The system had other quirks too, such as a tendency to suggest ‘I love you’ as a reply, which in most cases is not a very useful answer.
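That deduplication step can be pictured with a small, hypothetical sketch: given an embedding vector per candidate reply, only suggest replies whose vectors are not too similar to ones already chosen. The function name, vectors and the 0.9 threshold below are made up for illustration and are not Google’s actual semantic-intent component.

```python
import numpy as np

def pick_diverse(candidates, embeddings, k=3, max_similarity=0.9):
    """Greedily keep up to k replies whose embeddings differ enough from earlier picks."""
    picked = []
    for i, _reply in enumerate(candidates):
        vec = embeddings[i] / np.linalg.norm(embeddings[i])
        too_similar = any(
            float(vec @ (embeddings[j] / np.linalg.norm(embeddings[j]))) > max_similarity
            for j in picked
        )
        if not too_similar:
            picked.append(i)
        if len(picked) == k:
            break
    return [candidates[i] for i in picked]

# Toy usage with invented embeddings: the two near-duplicate suggestions collapse into one.
candidates = ["Shall we meet tomorrow?", "Shall we get together tomorrow?",
              "Sounds good!", "No, I can't."]
embeddings = np.array([[1.0, 0.0], [0.98, 0.2], [0.0, 1.0], [-1.0, 0.2]])
print(pick_diverse(candidates, embeddings))
```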
Of course, Corrado also addresses the privacy of the system: no actual people read the emails and privacy is guaranteed. That immediately poses a problem for the researchers, since they have to work with datasets they cannot read themselves, something like ‘solving a puzzle while blindfolded’.