Text Utterance Collection

Why Does Your Conversational AI Need Good Utterance Data?

Have you ever wondered how chatbots and virtual assistants wake up when you say ‘Hey Siri’ or ‘Alexa’? It is because of the collected utterances, or trigger words, embedded in the software, which activate the system as soon as it hears the programmed wake word.

However, creating sound and utterance data isn’t that simple. It must be carried out with the right technique to get the desired results. This blog therefore shares the route to creating good utterances and trigger words that work seamlessly with your conversational AI.

What are Utterances?

Utterances are the phrases or trigger words used to activate an artificially intelligent model. When your AI model detects its wake word, it automatically starts recording the user’s next request and responds with a suitable action or reply.

Wake word detection uses deep learning to teach the software how to recognize wake words. Once the wake word activates the software, the system starts capturing, decoding, and servicing the request. When not in use, the system keeps listening passively for trigger words.

For your AI software to produce accurate results, it is essential to capture many different utterances for every intent. This variety helps train the AI model better.
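The listen, wake, and capture cycle described above can be pictured as a tiny state machine. The sketch below is only an illustration that works on text transcripts: real assistants detect wake words in audio with trained models, and the wake phrase `hey assistant` and the function name `extract_requests` are assumptions made for this example.

```python
from typing import Iterable, List

WAKE_WORD = "hey assistant"  # hypothetical wake phrase, not a real product's


def extract_requests(transcripts: Iterable[str], wake_word: str = WAKE_WORD) -> List[str]:
    """Return the request that follows each wake word and ignore everything else."""
    requests = []
    listening = False  # False means passive: only waiting for the wake word
    for text in transcripts:
        normalized = text.strip().lower()
        if not listening:
            # Passive mode: nothing is captured until the wake word is heard.
            if normalized == wake_word:
                listening = True
        else:
            # Active mode: capture the next request, then go back to passive.
            requests.append(text.strip())
            listening = False
    return requests


heard = ["turn on the light", "Hey Assistant", "what's the weather", "play music"]
print(extract_requests(heard))
```

Only the sentence that directly follows the wake phrase is captured; anything said while the system is passive is ignored.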

[Also Read: Would you like to know how Siri and Alexa Understand You?]

Points to Remember While Creating a Repository of Utterances

Now that we know that training is important for AI models, the next thing to know is how to provide utterances to the AI models. Usually, a repository of utterances is created to train conversational AIs.

However, there are several things to remember while building a repository of utterances. Consider the following:

User Intent

First and foremost, while preparing utterances for your AI model, make sure you understand the user intent for which you are developing the datasets. You need to figure out the different utterances that users may enter while conversing with the AI model.

Variation of Utterances

Variations are an essential part of this process: the more variations you have for each intent, the better the results you will achieve. So, make sure to create multiple variations of user utterances. You can do it by:

  • Creating short, medium, and long versions of the same sentence.
  • Changing the words and length of sentences.
  • Using unique words.
  • Using both singular and plural forms.
  • Mixing up the grammar.
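One simple way to produce such variations is to combine interchangeable sentence parts. The following is a minimal sketch, assuming a hypothetical flight-booking intent; the helper name `expand_variations` and the phrase lists are invented for illustration and are not part of any specific AI platform.

```python
from itertools import product


def expand_variations(openers, actions, objects):
    """Combine sentence parts into short, medium, and long utterances."""
    variations = []
    for opener, action, obj in product(openers, actions, objects):
        # Skip empty parts so a missing opener yields a shorter sentence.
        sentence = " ".join(part for part in (opener, action, obj) if part)
        variations.append(sentence)
    # Remove duplicates while keeping the original order.
    return list(dict.fromkeys(variations))


book_flight = expand_variations(
    openers=["", "please", "I would like to"],          # varies sentence length
    actions=["book", "reserve"],                        # varies the verb
    objects=["a flight", "flights", "a plane ticket"],  # singular, plural, new words
)
print(len(book_flight))  # 18 distinct utterances
print(book_flight[:3])
```

Generated sentences like these are a starting point; they still need a human check, because not every combination sounds natural.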

Utterances Are Not Always Well Formed

Most people use fragmented sentences in their conversations, and they expect the same convenience when talking to bots. That is why you should include not only fully structured sentences but also typos, misspellings, and loosely phrased sentences in your training data.
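Imperfect utterances can be collected from real users, but they can also be roughly simulated. The sketch below shows two illustrative helpers, both hypothetical: `add_typo` swaps two neighboring characters, and `drop_filler_words` turns a full sentence into a fragment. The filler word list is an assumption chosen for this example.

```python
import random


def add_typo(utterance: str, rng: random.Random) -> str:
    """Swap two adjacent characters to mimic a typing slip."""
    if len(utterance) < 2:
        return utterance
    i = rng.randrange(len(utterance) - 1)
    chars = list(utterance)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def drop_filler_words(utterance, fillers=("please", "can", "you", "tell", "me", "the", "is")):
    """Remove common filler words to mimic a fragmented request."""
    kept = [word for word in utterance.split() if word.lower() not in fillers]
    return " ".join(kept)


rng = random.Random(7)  # fixed seed so repeated runs give the same typo
print(add_typo("where is the television", rng))
print(drop_filler_words("Can you tell me where the television is"))
```

Adding a handful of such noisy versions next to each clean utterance exposes the model to the kind of input people actually type.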

Leverage Representative Terms and References

When creating utterances, use standard terminology and references that most people understand. Remember, you do not have to build a sophisticated bot whose language only experts can follow. Instead, focus on formulating utterances that are common and easily understood by everyone.

Vary Phrases and Terminology

A common mistake among AI trainers is to vary the sentences but not the keywords in them. For instance, suppose you create utterances such as “In which room is the television?”, “Where is the television located?”, and “Where will I find the television?”.

The sentence structure changes across these utterances, but the root word ‘television’ stays the same. You need to vary everything you enter, so instead of always writing ‘television’, use synonyms for the word, such as ‘TV’.
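Keyword variation is easy to automate once you have a synonym list. This is a minimal sketch built on the television example; the synonym list and the helper name `vary_keyword` are assumptions for illustration, and a real project would draw synonyms from how its own users speak.

```python
TV_SYNONYMS = ["television", "TV", "telly", "TV set"]  # illustrative list only


def vary_keyword(utterances, keyword, synonyms):
    """Return every utterance once per synonym of the given keyword."""
    varied = []
    for utterance in utterances:
        for synonym in synonyms:
            varied.append(utterance.replace(keyword, synonym))
    return varied


base = [
    "In which room is the television?",
    "Where is the television located?",
    "Where will I find the television?",
]
for line in vary_keyword(base, "television", TV_SYNONYMS):
    print(line)
```

Three sentence patterns and four synonyms give twelve utterances, each differing in structure, keyword, or both.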

Example Utterances For Each Intent

Example utterances are assigned for each intent you have planned. Most AI training platforms suggest adding at least 10-15 utterances per intent. Fortunately, most development environments let you add utterances, create and test the model, and revisit your utterances.

So, for accurate entity extraction and intent prediction, the best practice is to add a few utterances first, test them, and then add more.
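While adding utterances in rounds, it helps to keep track of which intents are still short of examples. The sketch below assumes the training data is a list of `(utterance, intent)` pairs and uses 10 as the minimum, the lower end of the 10-15 range mentioned above; the function name and intent names are hypothetical.

```python
from collections import Counter

MIN_PER_INTENT = 10  # lower end of the 10-15 utterances suggested per intent


def intents_needing_more(examples, minimum=MIN_PER_INTENT):
    """Map each under-covered intent to the number of utterances still missing."""
    counts = Counter(intent for _utterance, intent in examples)
    return {intent: minimum - n for intent, n in counts.items() if n < minimum}


examples = (
    [(f"find tv phrasing {i}", "find_tv") for i in range(12)]
    + [(f"book flight phrasing {i}", "book_flight") for i in range(4)]
)
print(intents_needing_more(examples))  # {'book_flight': 6}
```

Running a check like this after each round shows where to add utterances before the next test.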

Testing & Review in Real-Life Scenarios

Testing the AI model is crucial to getting it right. It is best to test the model with different groups of people who do not know much about the project.

Doing so brings out weaknesses that your team usually does not detect, because your team shares a common understanding of the AI model you are designing.

Apart from that, review real user utterances continuously. This shows how the AI model is performing and lets you update it with better data and improvements.
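One possible way to organize this review is to flag the utterances the model was least sure about. The sketch below assumes each logged utterance is stored with the predicted intent and a confidence score between 0 and 1; the log format, the 0.6 threshold, and the function name `flag_for_review` are all assumptions for illustration.

```python
def flag_for_review(logged, threshold=0.6):
    """Return the utterances whose predicted intent scored below the threshold."""
    return [entry["text"] for entry in logged if entry["confidence"] < threshold]


logs = [  # hypothetical log entries
    {"text": "where's the telly", "intent": "find_tv", "confidence": 0.42},
    {"text": "where is the television", "intent": "find_tv", "confidence": 0.97},
    {"text": "tv??", "intent": "find_tv", "confidence": 0.55},
]
print(flag_for_review(logs))
```

Flagged utterances that turn out to be valid requests are good candidates to add to the training data for their intent.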


Ultimately, several factors contribute to the success of your conversational AI. It is therefore best to have the model trained by a professional service that understands the intricacies of the project. You may contact our Shaip team to discuss your requirements and learn about our process.

[Also Read: The Complete Guide to Conversational AI]
