Intro to Conversational AI
Updated: Feb 23
Ask anyone, from your neighbour to your doctor, and they’ll likely agree that artificial intelligence (AI) is changing the world. AI is quickly percolating through nearly every field, as machines become more intelligent and humans become more inventive in applying the technology.
At the same time, the way humans communicate is changing. Face-to-face communication is being replaced with a virtual, online world. This calls for a new type of AI at the intersection of the move towards intelligent machines and the change in human conversation: conversational AI.
Conversational AI is a technology that creates human-like interactions between computers and people. It enables computers to understand the words, meaning, and intention behind what a human is asking, and to produce a reply that’s similar to how a human would speak: short, casual, and contextual.
You’ve likely interacted with conversational AI when using an online chatbot, as this technology is becoming popular across sectors such as banking, retail, and travel. You have definitely used conversational AI if you’ve ever used a virtual personal assistant such as Google Assistant, Alexa, or Siri. Perhaps the first time you spoke to your assistant, it felt like magic. But what is really happening between asking Alexa your question and the generation of her reply?
Demystifying the Magic of the Chatbot
To explain, let’s use an example.
You want to know if you have time to grab a coffee from the cafe across the street. To figure this out, you may ask your chatbot: “What time does the cafe close?”
But you could also ask: “Is the cafe still open?”
Or: “Has the cafe closed?”
Each of these question variations is known as an utterance. The utterance is the question you ask the bot. The information you’re hoping to get, or the intention behind your question, is known as the intent. Multiple utterances can have the same intent. Though there are many ways to ask the question, your intent here is to find out the cafe’s hours of operation. Within the utterance, you have keywords that are known as entities. The bot identifies entities as the places, people, actions, and/or objects involved in the question. The entities here are the place (cafe) and time (close/open/closed). The combined recognition of the entities, user intent, and utterance enables the conversational AI bot to give you the correct reply.
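To make the three terms concrete, here is a minimal Python sketch of how the cafe utterances could map to one intent and a set of entities. It is only an illustration: the intent name `get_opening_hours`, the keyword lists, and the `parse` function are assumptions made up for this example, and a real conversational AI bot would learn these from training data rather than from hand-written word lists.

```python
import re

# Hypothetical keyword lists for illustration only; a real bot would learn
# these from training data rather than hard-code them.
INTENT_KEYWORDS = {
    "get_opening_hours": {"close", "closed", "open", "hours"},
}
PLACE_ENTITIES = {"cafe"}
TIME_ENTITIES = {"close", "closed", "open"}


def tokenize(utterance: str) -> list[str]:
    """Lowercase the utterance and split it into alphabetic words."""
    return re.findall(r"[a-z]+", utterance.lower())


def parse(utterance: str) -> dict:
    """Return the utterance together with its guessed intent and entities."""
    tokens = tokenize(utterance)
    entities = {
        "place": [t for t in tokens if t in PLACE_ENTITIES],
        "time": [t for t in tokens if t in TIME_ENTITIES],
    }
    # Pick the first intent whose keywords overlap with the utterance.
    intent = next(
        (name for name, kws in INTENT_KEYWORDS.items() if kws & set(tokens)),
        None,
    )
    return {"utterance": utterance, "intent": intent, "entities": entities}
```

Calling `parse` on “What time does the cafe close?”, “Is the cafe still open?”, and “Has the cafe closed?” gives three different utterances with the same intent and the same place entity, which is the point of the example above.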
The ability of a chatbot to actually capture, understand, and process the entities and intents comes from “natural language processing (NLP)” and “natural language understanding (NLU).” NLP processes your question and breaks it down into entities (keywords). NLP understands what you directly write or say, and converts the information into a format that the computer can understand. In a voice bot setting, NLP is also essential for speech recognition.
But NLP only partly gets you the answer to your question. NLU is required for the bot to understand what you actually meant; it uses the data from NLP to identify the intent. NLU interprets and understands your question, through consideration of the word choice, context, sentiment, and intent of your request.
An example of a system that doesn’t use NLU is a search engine: it recognizes keywords in your query, but doesn’t recognize your intention. For example, a search engine may not recognize that when you ask “what’s it like outside,” you are wondering what the weather is like outside, because you do not actually use the keyword “weather.” But the combination of NLP and NLU allows the bot to unravel the multitude of meanings behind your seemingly simple question. The bot can then understand that what you actually meant was “what is the temperature outside” (and maybe even, “should I wear a jacket”) and can give you that response.
“NLP understands what you directly write or say.
NLU understands what you actually meant.”
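The contrast with keyword search can be sketched in a few lines of Python. This is a toy, not how a real NLU engine works: the `INTENT_EXAMPLES` table, the intent name `get_weather`, and both function names are assumptions for illustration, and the exact-phrase lookup merely stands in for a trained model that generalizes to unseen phrasings.

```python
# Hypothetical paraphrase table standing in for a trained NLU model.
INTENT_EXAMPLES = {
    "get_weather": [
        "what's the weather",
        "what's it like outside",
        "should i wear a jacket",
    ],
}


def keyword_match(query: str, keyword: str) -> bool:
    """Search-engine style: succeed only if the literal keyword appears."""
    return keyword in query.lower()


def detect_intent(query: str):
    """NLU stand-in: map the query to an intent via known paraphrases."""
    normalized = query.lower().strip(" ?!.")
    for intent, examples in INTENT_EXAMPLES.items():
        if normalized in examples:
            return intent
    return None
```

For the query “What's it like outside?”, `keyword_match(query, "weather")` fails because the word never appears, while `detect_intent(query)` still resolves to the weather intent.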
Skirting the System - AKA Button-Based Bots
You may be wondering how a computer can understand the hundreds of ways a human can ask the same question. In fact, this is a key challenge in many conversational AI systems. Conversational AI bots that use NLU and NLP require teaching and training. Similar to how you would teach a child the rules of language, we need to teach the bot which entities to look for, how to recognize utterances, and what a human really intends by a sentence. This requires extensive time, talent, and knowledge of the industry jargon.
To get around these demands, many current text chatbots employ a button-based method. Instead of being given the opportunity to ask your bot any question that comes to mind, you are given select options via buttons. This methodology simplifies chatbot learning and removes the need for the bot to understand intent, making it possible to develop chatbots quickly. See the example below, where the user is given two options via the blue buttons:
Though button-based chats are technologically simpler to develop, buttons prevent the user from asking the infinite number of questions that a human can come up with. They prevent a natural dialogue, and with it the genuinely human-like conversation we are striving for.
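A button-based bot can be sketched as a simple lookup from a fixed set of labels to canned replies, which shows both why it is quick to build and why it is limiting. The button labels and replies below are assumptions for illustration (only “Yes, please!” and the game score come from the example in this article), and `button_bot` is a hypothetical name.

```python
# Hypothetical menu: each button label maps to a canned reply.
BUTTONS = {
    "Yes, please!": "Here is the score of the game.",
    "No, thanks": "Okay, maybe another time.",
}


def button_bot(choice: str) -> str:
    """Reply to a button press; anything that is not a button is rejected."""
    if choice not in BUTTONS:
        raise ValueError(
            f"Unknown option: {choice!r}. Choose one of {list(BUTTONS)}"
        )
    return BUTTONS[choice]
```

No intent detection is needed because the user can only send one of the predefined labels. Any free-form question, however reasonable, falls outside the menu.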
The Better Option: Open Text
The opposite of the button-based system is an open text chatbot. If this were an open text bot, the user could type “Yes, please!”, similar to the button option. The bot would understand, and give the user the score of the game. However, the interaction could also go as follows:
The bot could get confused and the user wouldn’t find out the basketball score. However, a particularly intelligent chatbot could recognize that the user wants a variety of answers. The user then gets a personalized reply that would otherwise be impossible with the button-based system.
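As a rough sketch of the difference, the open text handler below accepts any message, answers every request it recognizes, and falls back to a clarifying question when it recognizes none. Everything here is an assumption for illustration: the intent names, trigger words, replies, and the `open_text_bot` function are made up, and simple word matching stands in for the NLP and NLU a real bot would use.

```python
import re

# Hypothetical intents and trigger words; a real bot would rely on a
# trained NLU model instead of word matching.
INTENT_TRIGGERS = {
    "get_score": {"score", "yes"},
    "get_schedule": {"schedule", "next", "tomorrow"},
}
REPLIES = {
    "get_score": "Here is the score of the game.",
    "get_schedule": "Here is the upcoming schedule.",
}
FALLBACK = "Sorry, I didn't catch that. Could you rephrase?"


def open_text_bot(message: str) -> str:
    """Answer every recognized request in a free-form message."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    matched = [
        intent for intent, triggers in INTENT_TRIGGERS.items()
        if triggers & words
    ]
    if not matched:
        # The "confused bot" case: nothing recognized, so ask again.
        return FALLBACK
    return " ".join(REPLIES[intent] for intent in matched)
```

A message such as “What's the score, and when is the next game?” gets both answers in one reply, something a two-button menu could not offer.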
The best use of this open text technology also allows for continuation of the conversation. Similar to how you’ll initiate the next step in a conversation based on what was previously said, a conversational AI bot is able to continue the conversation with you by predicting what you’d like to chat about next.
This creates the feeling of a natural conversational flow. Open text is the preferable option for a truly personalized, user-centric interaction.
Is there more to do?
The field of conversational AI is just beginning. The implementation of open text allows for the free-form conversations humans desire, but to reach true human-like communication we must address the more nuanced aspects of communication. Chatbots are beginning to use and interpret tone, emotion, and empathy to move closer towards that human-like conversational experience. As this technological capability develops, we will certainly see a growth in the creative uses for conversational AI, and an expansion in the current markets of finance, retail, and travel.
But what about healthcare? Healthcare is an industry both gifted and burdened with massive amounts of information: gifted with so much knowledge to share, but burdened with the difficulty of accessing and sharing that knowledge. Chatbots are a useful tool to ease those difficulties, and conversational AI bots are ideally situated to fill this gap, reduce strain on healthcare practitioners, and improve patient understanding.
Curious to know more? Look out for our next article, where we discuss the exciting potential for conversational AI in healthcare.